Mammalian Grainyhead transcription factors

FIELD OF THE INVENTION 

The present invention relates generally to diagnostic and therapeutic agents. More
particularly, the present invention provides mammalian transcription factors which 
function in the modulation of expression of genetic sequences. The present invention 
further provides nucleic acid molecules encoding the transcription factors as well as 
nucleic acid and/or proteinaceous molecules with which the transcription factors interact. 

The transcription factors of the present invention or molecules interacting with same may
be used inter alia in the generation of a range of diagnostic and therapeutic agents for a
range of conditions. Therapeutic agents include gene-expression modulating agents
including sense and antisense molecules, ribozymes and RNAi-type molecules. The
present invention further provides medical assessment systems including drug evaluation
systems comprising genetically modified animals.

BACKGROUND OF THE INVENTION 

Reference to any prior art in this specification is not, and should not be taken as, an 
acknowledgment or any form of suggestion that this prior art forms part of the common
general knowledge in any country. 

Bibliographic details of references provided in the subject specification are listed at the end 
of the specification. 

The increasing sophistication of recombinant DNA techniques has provided significant 
progress in understanding the mechanisms involved in regulating eukaryotic gene 
expression. This is greatly facilitating research and development in the plant, agricultural, 
medical and veterinary industries. Transcription factors are an important component in the 
control of gene expression. However, despite their importance, mammalian transcription
factors have not been well investigated for their diagnostic and therapeutic potential. 

RNA polymerases in eukaryotic cells cannot initiate transcription alone; before
transcription can begin, they require interaction between transcription factors and the
promoter. These factors assemble at the promoter and, via a series of steps, facilitate both
the binding of RNA polymerase II to the promoter and its subsequent phosphorylation and
release to initiate transcription.

In addition to these general transcription factors, many thousands of transcription 
activators and/or negative regulators (inhibitors) exist, which control the process of 
initiation of gene transcription from great distances along the DNA. These factors
influence the timing and extent of transcription of a particular gene. Indeed, they control 
whether and to what extent a particular gene is transcribed in a cell of a particular tissue 
type. Although most gene regulators identified to date have been found to be proteins, 
some transcription factors may also be RNA molecules. 

In Drosophila, the transcription factor known as "Grainyhead" regulates key
developmental processes in the embryo and is encoded by the gene grainyhead. During
development, Grainyhead is initially involved in dorsal/ventral and terminal patterning of
the newly fertilized embryo through the formation of multi-protein complexes that repress
transcription from the decapentaplegic, tailless and zerknuellt genes (Huang et al., Genes
Dev. 9: 3177-3189, 1995; Liaw et al., Genes Dev. 9: 3163-3176, 1995). Later, grainyhead
is predominantly expressed in the embryonic central nervous system and in cuticle-producing
tissues, where it binds to promoters and influences transcription from other
developmentally regulated genes including engrailed, fushi tarazu and Ultrabithorax (Bray
et al., Genes Dev. 3: 1130-1145, 1989; Dynlacht et al., Genes Dev. 3: 1677-1688, 1989;
Biggin and Tjian, Cell 53: 699-711, 1988; Soeller et al., Genes Dev. 2: 68-81, 1988;
Dynlacht et al., Cell 56: 563-576, 1991; Attardi and Tjian, Genes Dev. 7: 1341-1353,
1993; Uv et al., Mol. Cell Biol. 14: 4020-4031, 1994).

The importance of grainyhead in Drosophila development is emphasised by the embryonic
lethal phenotype observed in flies carrying mutations in this gene. The embryos have
flimsy cuticles, grainy and discontinuous head skeletons and patchy tracheal tubes (Bray
and Kafatos, Genes Dev. 5: 1672-1683, 1991). A neuroblast-specific isoform of the
protein, arising from alternative splicing, has also been identified. A mutation that abolishes
this isoform is pupal- and adult-lethal, and flies demonstrate uncoordinated movements
(Uv et al., Mol. Cell Biol. 17: 6727-6735, 1997).

Mammalian homologs of grainyhead have previously been proposed, including three
genes designated CP2, LBP-1a and LBP-9. Studies have implicated them in a wide variety
of cellular and developmental events including T cell proliferation, globin gene expression
and steroid biosynthesis (Sueyoshi et al., Mol. Cell Biol. 15: 4158-4166, 1995; Jane et al.,
EMBO J. 14: 97-105, 1995; Volker et al., Genes Dev. 11: 1435-1446, 1997; Zhou
et al., Mol. Cell Biol. 20: 7662-7672, 2000). However, in situ analyses of both CP2 and
LBP-1a reveal ubiquitous expression of both genes, unlike the highly restricted pattern
observed with grainyhead in Drosophila (Bray et al., 1989, supra; Dynlacht et al., 1989,
supra; Bray and Kafatos, 1991, supra; Ramamurthy et al., J. Biol. Chem. 276: 7836-7842,
2001). It is concluded, therefore, that these genes are not close homologs of grainyhead.

Abnormalities in mammalian transcription factor expression are considered to play a role 
in a number of different genetic disorders and birth defects such as spina bifida and 
anencephaly. There is therefore a need to identify mammalian transcription factors and in 
particular close mammalian homologs of Grainyhead and to use these to develop a range 
of diagnostic and therapeutic agents. 

SUMMARY OF THE INVENTION

Throughout this specification, unless the context requires otherwise, the word "comprise", 
or variations such as "comprises" or "comprising", will be understood to imply the 
inclusion of a stated element or integer or group of elements or integers but not the
exclusion of any other element or integer or group of elements or integers. 

Nucleotide and amino acid sequences are referred to by a sequence identifier number (SEQ
ID NO:). The SEQ ID NOs: correspond numerically to the sequence identifiers <400>1
(SEQ ID NO:1), <400>2 (SEQ ID NO:2), etc. A sequence listing is provided at the end of
the specification. A summary of the SEQ ID NOs is provided in Table 2.

Genetic sequences were studied which exhibited homology at the nucleotide and/or amino
acid level to a Drosophila gene, the product of which is involved in body patterning, where
a fine balance between activation and inhibition of gene expression is critical to the correct
development of cells and tissues into functional organisms. A large number of different
families of transcription factors play a critical role in ensuring that this balance is
maintained during embryological development. One such transcription factor, cloned from
Drosophila and well-characterized, is Grainyhead (hereinafter referred to by its
abbreviation, GRH). GRH is encoded by the gene grainyhead (grh). The inventors
observed that previously published putative mammalian grh homologs
showed much more ubiquitous expression compared with the highly restricted pattern
exhibited by Drosophila grh. Furthermore, sequence similarity between the proposed
mammalian homologs and the Drosophila grh sequence was relatively low. In accordance
with the present invention, true grh homologs were identified and derived from
mammalian tissue such as human and mouse tissue.

Accordingly, one aspect of the present invention provides an isolated nucleic acid
molecule comprising a sequence of nucleotides encoding or complementary to a sequence
encoding a mammalian homolog of Drosophila GRH. A mammalian homolog of GRH is
referred to herein as M-GRH. The corresponding gene is referred to as M-grh. An M-grh is
deemed a homolog of Drosophila grh (D-grh) if it comprises a nucleotide sequence
having 60% or greater similarity to the nucleotide sequence of D-grh after optimal
alignment. Likewise, an M-GRH is so defined if it comprises an amino acid sequence
having 60% or greater similarity to the amino acid sequence of Drosophila GRH (D-GRH).
There are four isoforms of Drosophila grh designated D-grh P1, D-grh P2, D-grh
P3 and D-grh P4. The nucleotide sequences encoding these isoforms are set forth in SEQ ID NO:17,
SEQ ID NO:34, SEQ ID NO:36 and SEQ ID NO:38, respectively. Mammalian
sequences encompassed by the present invention include those derived from tissues of
mouse and human including, for example, mouse embryo, human fetal brain and placenta,
and mouse and human kidney. Reference herein to Drosophila grh includes any or all of
its isoforms P1-P4.
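
The 60% threshold "after optimal alignment" can be illustrated with a minimal sketch. The specification does not name an alignment algorithm or scoring scheme at this point, so everything below is an assumption made for illustration: a Needleman-Wunsch global alignment with arbitrary match, mismatch and gap scores, and percent identity over the aligned columns used as a simple stand-in for "similarity". The function names and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment (assumed scoring); returns two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical columns as a percentage of all aligned columns (gaps count as differences)."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)


def meets_threshold(candidate, reference, threshold=60.0):
    """True if the candidate reaches the assumed 60% cut-off against the reference."""
    return percent_identity(candidate, reference) >= threshold


# Toy sequences only; real comparisons would use the sequences in the Sequence Listing.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
print(meets_threshold("AAAA", "TTTT"))
```

A substitution-matrix-based score (for amino acid sequences) would be closer to a true "similarity" measure; identity is used here only to keep the sketch short.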

The mammalian sequences identified by the present inventors show higher percentages of
similarity to the D-grh sequence than the already identified mammalian sequences
designated CP2, LBP-1a and LBP-9. In accordance with the present invention, it is
proposed that the M-grh homologs disclosed are "true" grh homologs relative to CP2,
LBP-1a and LBP-9. As a result of the analysis herein described, it is shown that the earlier
sequences align phylogenetically with another distinct Drosophila factor, designated
Drosophila CP2. A new family of transcription factors, highly conserved from Drosophila
to human and having distinct tissue-specificity profiles, is now described in accordance
with the present invention.

The true M-grh homologs of the present invention include mammalian grainyhead (gene:
mgr; expression product: MGR), brother of mgr (gene: bom; expression product: BOM)
and sister of mgr (gene: som; expression product: SOM). MGR has multiple isoforms including MGR
p49 and MGR p70 in humans and MGR p61 in mice. A summary of the SEQ ID NOs for
the M-grh and M-GRH molecules of the present invention is shown in Table 2. The
sequences are provided in the Sequence Listing. The gene som and its product SOM are
also referred to herein as grhl3 and GRHL3, respectively.

The present invention provides, therefore, expression products of the M-grh genes, mgr,
bom and som as well as derivatives and homologs thereof. This aspect of the present
invention does not extend to CP2, LBP-1a or LBP-9.

Accordingly, another aspect of the present invention provides an isolated nucleic acid
molecule comprising a sequence of nucleotides encoding a polypeptide comprising a
predicted amino acid sequence substantially as set forth in SEQ ID NO:2 (human MGR
p49), SEQ ID NO:4 (human MGR p70), SEQ ID NO:6 (human BOM), SEQ ID NO:8
(human SOM), SEQ ID NO:10 (murine MGR p61), SEQ ID NO:12 (murine MGR p70),
SEQ ID NO:14 (murine BOM) or SEQ ID NO:16 (murine SOM) or an amino acid
sequence having at least about 60% similarity to SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 or SEQ ID NO:16
after optimal alignment.

The preferred nucleic acid molecules comprise sequences of nucleotides substantially as
set forth in SEQ ID NO:1 (human mgr p49), SEQ ID NO:3 (human mgr p70), SEQ ID
NO:5 (human bom), SEQ ID NO:7 (human som), SEQ ID NO:9 (murine mgr p61), SEQ
ID NO:11 (murine mgr p70), SEQ ID NO:13 (murine bom) or SEQ ID NO:15 (murine
som) or complementary forms thereof, or a nucleotide sequence having at least about 60%
similarity to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:13 or SEQ ID NO:15 after optimal alignment or their
complementary forms or a nucleotide sequence capable of hybridizing to SEQ ID NO:1,
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID
NO:13 or SEQ ID NO:15 or complementary forms thereof under low stringency
conditions. Again, this aspect of the present invention does not extend to nucleic acid
molecules encoding CP2, LBP-1a and LBP-9.

The present invention further extends to recombinant forms of the M-GRH molecules.
Preferred recombinant M-GRH molecules having amino acid sequences defined in
parentheses include human MGR p49 (SEQ ID NO:2), human MGR p70 (SEQ ID NO:4),
human BOM (SEQ ID NO:6), human SOM (SEQ ID NO:8), murine MGR p61 (SEQ ID
NO:10), murine MGR p70 (SEQ ID NO:12), murine BOM (SEQ ID NO:14) and murine
SOM (SEQ ID NO:16).

Reference to "M-GRH" molecules includes derivatives, homologs and analogs thereof.

The mammalian transcription factors of the present invention are proposed to be involved
in the regulation of expression of a range of genes such as, but not limited to,
developmentally regulated genes involved in determining patterning. Some of the genes
regulated encode critical products, the absence or malfunctioning of which is proposed to
lead to unwanted phenotypes and/or predispositions to certain medical conditions. That is,
the presence of a mutation in and/or malfunction of an M-grh, including over- or under-
expression of the transcription factors of the present invention, is proposed to cause
incorrect regulation of one or more of these genes, thereby leading to an inappropriate
phenotype. The ability to detect mutations in the nucleotide sequences encoding the M-grh
homologs permits the detection of a range of abnormalities or a predisposition for
development of abnormalities. Furthermore, as many of the genes will be developmentally
regulated genes, identification of the transcription factors permits identification of
unknown developmentally regulated genes.

Accordingly, another aspect of the present invention contemplates a method for detecting a
variation in a polynucleotide sequence encoding an M-GRH transcription factor.

Furthermore, the isolated nucleic acid molecules of the present invention may be
used to correct such an abnormality in a subject in need thereof or at risk of developing an
abnormality. The nucleic acid molecules of the present invention may be comprised,
therefore, within a suitable vector for delivery of all or part of the sequence to a recipient
cell or tissue. The nucleic acid molecule or part thereof could also be administered directly
for transient expression. The present invention provides, therefore, the potential for both a
diagnostic and a therapeutic capability.

Accordingly, a further aspect of the present invention contemplates a genetic construct
comprising a nucleotide sequence selected from SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15 or a
nucleotide sequence having at least 60% similarity to one or more of SEQ ID NO:1, SEQ
ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13 or
SEQ ID NO:15 after optimal alignment or a nucleotide sequence capable of hybridizing to
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:13 or SEQ ID NO:15 or a complementary form thereof under low
stringency conditions.

In a related embodiment, the present invention provides a genetic construct comprising a 
promoter or functional equivalent thereof operably linked to a nucleotide sequence of the 
invention. 

The present invention further provides animal models comprising genetically altered
M-grh sequences including insertions, deletions, additions, and substitutions. Such animal
models including genetically modified animals are useful in the development of medical
assessment systems such as to monitor physiological changes and to evaluate drug targets
and drug candidates. The medical assessment system may also be used in drug
development.

Examples of drugs or other therapeutic agents include genetic agents such as sense and
antisense molecules, ribozymes, DNAzymes, methylation- or demethylation-inducing
agents as well as RNAi-type agents. Peptide mimetics and non-proteinaceous chemical
entities are also contemplated by the present invention.

Genes are represented herein in lower case italics. Expression products (e.g. proteins or
RNA) are represented in upper case, non-italic letters. A summary of the genes and their
expression products is provided in Table 1. The gene "som" and its expression product
"SOM" are also referred to as grhl3 and GRHL3, respectively.

TABLE 1
Abbreviations

GENE                                      EXPRESSION PRODUCT
grainyhead (grh)                          Grainyhead (GRH)
mammalian grainyhead homologs (M-grh)     mammalian grainyhead homologs (M-GRH)
mammalian grainyhead (mgr)                mammalian Grainyhead (MGR)
brother of mammalian grainyhead (bom)     brother of mammalian grainyhead (BOM)
sister of mammalian grainyhead (som)      sister of mammalian grainyhead (SOM)

A summary of sequence identifiers used throughout the specification is provided in Table 2.



TABLE 2
Summary of sequence identifiers

SEQ ID NO:  NAME                            DESCRIPTION
1           human mgr p49                   Nucleotide sequence encoding mammalian grainyhead derived from human fetal brain
2           human MGR p49                   Predicted amino acid sequence corresponding to SEQ ID NO:1
3           human mgr p70                   Nucleotide sequence encoding mammalian grainyhead being an isoform of SEQ ID NO:1, derived from human kidney
4           human MGR p70                   Predicted amino acid sequence corresponding to SEQ ID NO:3
5           human bom                       Nucleotide sequence encoding mammalian grainyhead derived from human placenta
6           human BOM                       Predicted amino acid sequence corresponding to SEQ ID NO:5
7           human som                       Nucleotide sequence encoding mammalian grainyhead
8           human SOM                       Predicted amino acid sequence corresponding to SEQ ID NO:7
9           murine mgr p61                  Nucleotide sequence encoding mammalian grainyhead derived from 17.5 day murine embryo
10          murine MGR p61                  Predicted amino acid sequence corresponding to SEQ ID NO:9
11          murine mgr p70                  Nucleotide sequence encoding mammalian grainyhead being an isoform of SEQ ID NO:9, derived from murine kidney
12          murine MGR p70                  Predicted amino acid sequence corresponding to SEQ ID NO:11
13          murine bom                      Nucleotide sequence encoding mammalian grainyhead derived from a murine embryonic carcinoma cell line (P19)
14          murine BOM                      Predicted amino acid sequence corresponding to SEQ ID NO:13
15          murine som                      Nucleotide sequence encoding mammalian grainyhead
16          murine SOM                      Predicted amino acid sequence corresponding to SEQ ID NO:15
17          grh-P1                          Nucleotide sequence encoding the Drosophila transcription factor designated Grainyhead (grh)
18          GRH-P1                          Amino acid sequence corresponding to SEQ ID NO:17
19-20       human p49 mgr                   primers
21-22       human p70 mgr                   primers
23-24       human bom                       primers
25-26       murine p70 mgr                  primers
27-28       murine p61 mgr                  primers
29-30       murine bom                      primers
31-32       human S14                       primers
33          Drosophila dopa decarboxylase   promoter
34          Drosophila PCNA                 promoter
35          human Engrailed-1               promoter
36          grh-P2                          Nucleotide sequence encoding the Drosophila transcription factor designated Grainyhead (grh) isoform P2
37          GRH-P2                          Amino acid sequence corresponding to SEQ ID NO:36
38          grh-P3                          Nucleotide sequence encoding the Drosophila transcription factor designated Grainyhead (grh) isoform P3
39          GRH-P3                          Amino acid sequence corresponding to SEQ ID NO:38
40          GRHL-3                          Primer
41          GRHL-3                          Primer
42          GRHL-3                          Primer
43          HPRT                            Primer
44          Antisense                       Primer
45          Exon 8 and Exon 13 Sense        Primer
46          Antisense                       Primer

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 is a representation showing that the mgr genomic locus encodes two distinct
isoforms. (A) Alignment of the predicted NH2-terminal amino acid sequence of the p70
isoform of MGR and BOM. Amino acid identity is denoted by shared upper case letters
and similarity by the (+) symbol. The first amino acids shared between p61 MGR and p70
MGR are given in bold. (B) Structure of the human and murine mgr genomic loci. Human
genomic sequence was downloaded from the GenBank database (Accession Number
AC010969) and aligned with cDNA sequences. Murine genomic clones were obtained
from a 129 library and mapped by Southern analysis and PCR. Exons are denoted as E1-8
in human and E1-9 in murine. The two human MGR isoforms are denoted as p70 and p49
MGR and the two murine isoforms as p70 and p61 MGR. The scale of 1 kb is shown. (C)
Identification of the murine p61 MGR promoter. Sequence was obtained from intron three
from the MGR genomic locus and analyzed using the weight matrices of Bucher, J. Mol.
Biol. 212: 563-578, 1990. The CAP site, TATA box and GC box are indicated. The cDNA
start site is indicated by arrows, the first ATG is given in bold and the splice site at the end of
the first exon of p61 MGR is indicated.

Figure 2 is a photographic representation showing that p70 MGR binds to Drosophila
gene regulatory sequences which bind grh. (A) p70 MGR binds to the Drosophila PCNA
promoter. Nuclear extract from the JEG-3 cell line was studied in an EMSA with a PCNA
promoter probe in the presence and absence of anti-MGR specific antisera. Antisera 611
was raised against peptides common to the p70 and p49 MGR proteins in the dimerization
domain and antisera 67 was raised against unique peptides in the NH2-terminal domain of
p70 MGR. The migration of the MGR complex is indicated by arrows. (B) p70 MGR binds to
the Drosophila dopa decarboxylase promoter. Experimental conditions were as described
for (A).

Figure 3 is a representation showing that p70 MGR binds to and transactivates the human
En-1 promoter. (A) Identification of a grh consensus DNA binding site in the human En-1
promoter. The consensus sequence for grh DNA binding compiled from an alignment of
the Drosophila Ultrabithorax, Dopa decarboxylase and fushi tarazu promoters was
compared with the sequence of the proximal human En-1 promoter and the Drosophila
engrailed promoter. The closed bracket indicates the extent of the grainyhead binding site
in the engrailed promoter as defined by DNase I footprinting. (B) Human p70 MGR binds
to the human En-1 promoter. Nuclear extract from the JEG-3 cell line was studied in an
EMSA with a Ddc promoter probe in the presence of pre-immune sera (lane 1), anti-MGR
specific antisera 67 (detailed in legend to Figure 2) (lane 2) or cold competitor DNA (lanes
3-5). A 50-fold excess of the Ddc probe was used in lane 3 and a 10- and 20-fold excess of
a human En-1 promoter probe in lanes 4 and 5, respectively. The migration of the
MGR/DNA complex is shown by arrows. (C) Human p70 MGR transactivates the En-1
promoter. COS cells were transiently transfected with the proximal En-1 promoter
containing the MGR binding site linked to a minimal γ-globin promoter and a firefly
luciferase reporter gene (solid columns), the minimal γ-globin promoter/luciferase reporter
gene (open columns) and the TK promoter linked to the Renilla luciferase reporter gene
(hatched columns) in the presence and absence of a p70 MGR expression vector (pCI-p70
MGR) as indicated. Transfection with the empty vector (pCI) served as the control.
Luciferase levels were corrected for protein concentration and values were derived from
two independent experiments performed in triplicate.

Figure 4 is a photographic representation showing expression of GRHL-3 from E8.5 to
E15.5. Sections of murine embryos were analysed by in situ hybridisation with a GRHL-3-
specific 33P-labelled antisense riboprobe. (A) Transverse section of an E8 embryo showing
two discrete areas of intense expression in non-neuronal ectoderm (arrowed) adjacent to
the folding neural plate (bottom panel). The section counter-stained with hematoxylin is
shown (top panel). da, dorsal aorta; hd, hind-gut diverticula; ne, neural epithelium; se,
surface ectoderm. (B) Transverse section of E8 embryo probed with the control sense
riboprobe (bottom panel) and the hematoxylin counter-stain (top panel). (C,D) Sagittal
sections of E12.5 (C) and E15.5 (D) embryos showing increasingly intense hybridisation to
surface ectoderm (C and D). Hybridisation is also noted to other tissues lined by squamous
epithelium including oral cavity, urogenital sinus and anal canal (D). ac, anal canal; dea,
descending aorta; dv, ductus venosus; gt, genital tubercle; he, heart; li, liver; nc, nasal
cavity; np, nasal process; o, oral cavity; se, surface ectoderm; ta, tail; to, tongue; us,
urogenital sinus. Signal from the descending aorta and ductus venosus is non-specific due
to reflection from retained erythrocytes.

Figure 5 is a representation showing the generation of a null allele of GRHL-3. (A) Gene-
targeting strategy applied to the mouse GRHL-3 locus. The homologous recombination
event deleted 2.2 kb of genomic DNA, including the region encoding the entire
transcriptional activation domain of the protein. This was replaced with a promoter-less
lacZ.polyA cassette fused to the second codon of exon 2 and a NeoR gene linked to a PGK
promoter and flanked by loxP sites. The thymidine kinase gene driven off the MC1
promoter completed the targeting vector. The location of the 5' and 3' probes used for
Southern blot analysis of the targeted allele and the size of the expected hybridisation
fragments prior to excision of the NeoR cassette are shown. The NeoR cassette was excised
by crossing mice heterozygous for the targeted allele with a transgenic line expressing the
Cre recombinase gene driven off a CMV promoter. LacZ.polyA, the lacZ gene linked to
the rabbit β-globin polyadenylation signal; B, BamHI; S, SpeI. (B) Southern blot analysis
of two targeted ES cell clones (C7 and B12) and the parental ES cells (G7) with the 5'
flanking probe demonstrating site-specific integration by homologous recombination. The
size of DNA standards (in kb) is indicated. (C) Germ-line transmission of the targeted
allele from cell line C7. Southern blotting was performed with the 3' flanking probe on tail
DNA isolated from weaned progeny of GRHL-3+/- intercrosses. The size of DNA standards
(in kb) is indicated. (D) PCR genotyping of embryos. Two-allele, three-primer PCR was
performed on genomic DNA from E18.5 embryos isolated from GRHL-3+/- intercrosses.
The size of DNA standards (in bp) is indicated. target, PCR product diagnostic of targeted
GRHL-3 allele; wt, PCR product diagnostic of wild type GRHL-3 allele. (E) Northern blot
analysis of GRHL-3 mRNA expression in wild type embryos and embryos heterozygous or
homozygous for the targeted GRHL-3 allele (upper panel). RNA integrity was confirmed
with a GAPDH probe (lower panel). The size of RNA standards is indicated, as is the
migration of the GRHL-3 and GAPDH transcripts. (F) RT-PCR of E9.5 GRHL-3-/- and
GRHL-3+/- embryos was performed with primers specific for HPRT. Based on the HPRT
quantitation, comparable amounts of cDNA from each embryo were PCR amplified for 30,
32 and 35 cycles (GRHL-3+/-) or 35, 38 and 40 cycles (GRHL-3-/-) with primers specific
for GRHL-3. The 5' primer anneals to exon 8 and the 3' primer anneals to exon 13. Both
primer pairs gave predicted size bands of 503 bp for GRHL-3 and 229 bp for HPRT. The
identities of the amplified bands were confirmed by Southern blotting using gene-specific
internal oligonucleotides.

Figure 6 is a photographic representation of the phenotype of the GRHL-3-deficient mice.
(A) E18.5 littermate embryos, wild type (+/+) and deficient (-/-). The range of NTDs in the
GRHL-3-/- embryos is illustrated: exencephaly (arrow) and thoraco-lumbo-sacral spina
bifida (arrowhead). Curled tails and growth retardation are also apparent in these embryos.
A magnified view of the curly tail (ct) and the spina bifida from a caudal longitudinal view
(clv) are inset. (B,C) Alizarin red/Alcian blue stained full-body skeletal preparations of
E18.5 littermates illustrating the kyphosis (k) and tail flexion deformity (ct) in (B), and the
abnormal vertebral pedicles in the thoraco-lumbo-sacral regions of the GRHL-3-/- embryo
in (B and C). np, normal pedicles; sp, splayed pedicles. (D) Transverse sections through +/+
and -/- E8.5 to E14.5 embryos in the region of the caudal neural tube stained with
hematoxylin and eosin. The open neural plate is arrowed.

Figure 7 is a representation showing that GRHL-3 and ct are the same gene. (A) Organization
of the ct candidate region. Genetic map of the 13 Mb supercontig (Accession number
NW_000213) that shows the positions of relevant markers (D4Mit69 and D4Mit157) and
previously excluded ct candidate genes (Synd3, Fgr, Hspg2, Pax7). The position of the
GRHL-3 locus is also indicated. The size of the interval between the GRHL-3 locus and
the D4Mit69 marker is shown. (B) Morphological appearance and genotype of embryos
derived from ct/ct mice crossed with GRHL-3+/- mice. Embryo 1 is unremarkable; embryos
2 and 3 display curly tails (arrowheads); embryos 4 and 5 display curly tails and lumbo-
sacral spina bifida (arrows). ct, curly tail; SB, spina bifida. Scale bar = 10 mm. (C) Total
RNA from E14.5 embryos from curly tail (ct/ct), wild type (+/+) and GRHL-3
heterozygotes (+/-) was analysed for GRHL-3 expression by Northern blotting with a
cDNA probe derived from the unique coding portion of the mRNA described in Fig 1A
(upper panel). RNA loading was monitored by probing with 28S (lower panel). Signal
intensity was quantified by Phosphorimager densitometry and the individual embryo
GRHL-3 signals corrected for 28S loading. The corrected signal intensities relative to wild
type embryo 7 are shown. Positions of GRHL-3, 28S and the RNA size standards are
indicated. (D) Quantitative real-time RT-PCR was performed on total RNA from E14.5
curly tail (ct/ct), wild type (+/+) and GRHL-3 heterozygous (+/-) embryos. A standard
curve was generated for HPRT and GRHL-3 and the relative quantity of both transcripts
was calculated for individual embryos. Each reaction was performed in duplicate. The
ratios of GRHL-3/HPRT in GRHL-3+/- and ct/ct embryos were normalised to the values
obtained with GRHL-3+/+ embryos.
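
The normalisation described for panel (D) of Figure 7 can be sketched as follows. This is only an illustration under stated assumptions: the standard curve is taken to be linear in the form Ct = slope x log10(quantity) + intercept, duplicate reactions are averaged, and the curve parameters and Ct values in the example are invented placeholders rather than data from the specification. All function names are hypothetical.

```python
from statistics import mean


def quantity_from_ct(ct, slope, intercept):
    """Invert an assumed linear standard curve: Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def target_to_reference_ratio(target_cts, reference_cts, target_curve, reference_curve):
    """Mean relative quantity of the target (e.g. GRHL-3) over the reference (e.g. HPRT).

    Each *_cts argument holds the duplicate Ct values for one embryo;
    each *_curve argument is a (slope, intercept) pair for that transcript.
    """
    target_q = mean(quantity_from_ct(ct, *target_curve) for ct in target_cts)
    reference_q = mean(quantity_from_ct(ct, *reference_curve) for ct in reference_cts)
    return target_q / reference_q


def normalise_to_wild_type(ratio, wild_type_ratios):
    """Express one embryo's ratio relative to the mean wild type (+/+) ratio."""
    return ratio / mean(wild_type_ratios)


# Placeholder curve parameters and Ct values, for illustration only.
curve = (-3.0, 30.0)
wild_type = target_to_reference_ratio([24.0, 24.0], [21.0, 21.0], curve, curve)
test_embryo = target_to_reference_ratio([27.0, 27.0], [21.0, 21.0], curve, curve)
print(normalise_to_wild_type(test_embryo, [wild_type]))
```

With this convention a value of 1.0 corresponds to wild type expression, so values below 1.0 indicate reduced GRHL-3 transcript relative to HPRT.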

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention is predicated in part on the identification of mammalian homologs of
the Drosophila transcription factor known as Grainyhead (GRH). GRH is encoded by the
gene grainyhead (grh). In Drosophila, mutations in this gene are associated with
embryonic lethal phenotypes, indicating the importance of the gene for normal
development and function. The mammalian homologs are proposed to be involved in the
regulation of developmental and/or non-developmental genes. Identification and isolation
of the mammalian homologs of grh (M-grh) enable the development of a range of
diagnostic and therapeutic agents useful in the detection and treatment of genetic disorders.

The present invention provides, therefore, a family of mammalian-derived transcription
factors, highly related from Drosophila to mammals. These transcription factors are more
highly conserved than CP2, LBP-1a and LBP-9. The present invention does not extend to
CP2, LBP-1a and LBP-9. Reference to a mammal in this context includes a human,
livestock animal (e.g. sheep, cow, horse, pig, donkey, goat), laboratory test animal (e.g.
mouse, rat, rabbit, guinea pig), companion animal (e.g. dog, cat) or captive wild animal.
Most preferably, the animal is a human or murine species. Sources of the isolated nucleic
acid molecules include a range of tissues, such as mouse embryo, human fetal brain and
placenta, and mouse and human kidney. In view of the highly conserved nature of this
family of M-grh nucleotide sequences, however, corresponding homologs from other
tissues and from other mammalian species are intended to be included within the scope of
the present invention. The term "homolog" as used herein, therefore, extends to encompass
transcription factors from mammalian species encoded by nucleotide sequences which
have substantial similarity to Drosophila grh or a conserved region thereof. At the protein
level, a homolog includes an amino acid sequence and/or tertiary structure having
similarity to Drosophila GRH. In cases where the expression product of the M-grh is
RNA, a homolog is defined by reference to the similar ribonucleotide sequence to that
encoded by Drosophila grh.

M-grh or M-GRH, i.e. a mammalian homolog of Drosophila grh or GRH, is defined as
such by having a nucleotide or amino acid sequence which has 60% or greater similarity
after optimal alignment to Drosophila grh or GRH.

Accordingly, one aspect of the present invention provides an isolated nucleic acid
molecule comprising a sequence of nucleotides encoding or complementary to a sequence 
encoding a mammalian homolog of Drosophila grh. 

Reference to a mammalian homolog of Drosophila GRH (i.e. a M-GRH) preferably
includes the mammalian homolog of grainyhead (MGR), brother of MGR (BOM) and
sister of MGR (SOM). These transcription factors are encoded by mgr, bom and som,
respectively. Reference to "MGR", "BOM" and "SOM" or mgr, bom and som includes all
mutants, derivatives, homologs and analogs thereof. The present invention further extends,
however, to all novel mammalian homologs of Drosophila grh but does not encompass
CP2, LBP-1a or LBP-9. The nucleotide sequences for Drosophila grh are set forth in SEQ
ID NO:17, SEQ ID NO:34, SEQ ID NO:36 and SEQ ID NO:38, respectively.
Consequently, a mammalian homolog is defined herein as comprising a nucleotide
sequence having at least about 60% sequence similarity to SEQ ID NO:17 or SEQ ID
NO:34 or SEQ ID NO:36 or SEQ ID NO:38 after optimal alignment and/or being capable
of hybridizing to SEQ ID NO:17 or SEQ ID NO:34 or SEQ ID NO:36 or SEQ ID NO:38
or its complementary form under low stringency conditions.

Accordingly, another aspect of the present invention provides an isolated nucleic acid
molecule encoding a mammalian transcription factor or a functional part thereof
comprising a sequence of nucleotides having at least 60% similarity to SEQ ID NO:17 or
SEQ ID NO:34 or SEQ ID NO:36 or SEQ ID NO:38 after optimal alignment and/or being
capable of hybridizing to SEQ ID NO:17 or its complementary form under low stringency
conditions.

In a preferred embodiment, the isolated nucleic acid molecule encodes a proteinaceous
form of a transcription factor. Examples of such mammalian protein transcription factors
include human MGR p49 (SEQ ID NO:2), human MGR p70 (SEQ ID NO:4), human
BOM (SEQ ID NO:6), human SOM (SEQ ID NO:8), murine MGR p61 (SEQ ID NO:10),
murine MGR p70 (SEQ ID NO:12), murine BOM (SEQ ID NO:14) and murine SOM
(SEQ ID NO:16).

Accordingly, another aspect of the present invention is directed to an isolated nucleic acid
molecule comprising a sequence of nucleotides encoding a polypeptide having
transcription factor activity and comprising an amino acid sequence substantially as set
forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ
ID NO:12, SEQ ID NO:14 or SEQ ID NO:16 or an amino acid sequence having at least
about 60% similarity to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ
ID NO:10, SEQ ID NO:12, SEQ ID NO:14 or SEQ ID NO:16 after optimal alignment
wherein said polypeptide is a mammalian homolog of Drosophila GRH.

Such a polypeptide is referred to herein as a M-GRH.

Preferred percentage amino acid similarity levels include at least about 61% or at least 
about 62% or at least about 63% or at least about 64% or at least about 65% or at least 
about 66% or at least about 67% or at least about 68% or at least about 69% or at least
about 70% or at least about 71% or at least about 72% or at least about 73% or at least
about 74% or at least about 75% or at least about 76% or at least about 77% or at least 
about 78% or at least about 79% or at least about 80% or at least about 81% or at least 
about 82% or at least about 83% or at least about 84% or at least about 85% or at least 
about 86% or at least about 87% or at least about 88% or at least about 89% or at least
about 90% or at least about 91% or at least about 92% or at least about 93% or at least
about 94% or at least about 95% or at least about 96% or at least about 97% or at least 
about 98% or at least about 99% similarity. 

This aspect of the present invention includes derivatives of M-GRH molecules. Such
derivatives include non-active fragments which encompass inter alia the binding domain
as well as active isoforms.

A "derivative" of a polypeptide of the present invention also encompasses a portion or a
part of a full-length parent polypeptide, which retains the transcription factor activity of the
parent polypeptide. Such "biologically-active fragments" include deletion mutants and
small peptides, for example, of at least 10, preferably at least 20 and more preferably at
least 30 contiguous amino acids, which exhibit the requisite activity. Peptides of this type
may be obtained through the application of standard recombinant nucleic acid techniques
or synthesized using conventional liquid or solid phase synthesis techniques. For example,
reference may be made to solution synthesis or solid phase synthesis as described, for
example, in Chapter 9 entitled "Peptide Synthesis" by Atherton and Shephard which is
included in a publication entitled "Synthetic Vaccines" edited by Nicholson and published
by Blackwell Scientific Publications. Alternatively, peptides can be produced by digestion
of an amino acid sequence of the invention with proteinases such as endoLys-C,
endoArg-C, endoGlu-C and staphylococcus V8-protease. The digested fragments can be
purified by, for example, high performance liquid chromatographic (HPLC) techniques. Any such
fragment, irrespective of its means of generation, is to be understood as being
encompassed by the term "derivative" as used herein.
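
By way of illustration only, the proteinase digestion referred to above may be approximated in silico. The following Python sketch assumes the commonly described specificities (cleavage on the C-terminal side of lysine for endoLys-C, of arginine for endoArg-C and of glutamate for endoGlu-C) and ignores missed cleavages and sequence-context effects. The function name `digest`, the rule table and the example peptide are hypothetical and do not form part of the specification.

```python
# Simplified cleavage rules (an assumption for illustration): each enzyme is
# taken to cut on the C-terminal side of the listed residue(s) only.
CLEAVAGE_RESIDUES = {
    "endoLys-C": "K",
    "endoArg-C": "R",
    "endoGlu-C": "E",
}


def digest(sequence: str, enzyme: str) -> list[str]:
    """Split a one-letter amino acid sequence after each recognised residue."""
    targets = CLEAVAGE_RESIDUES[enzyme]
    fragments = []
    current = ""
    for residue in sequence.upper():
        current += residue
        if residue in targets:
            # Cleavage occurs immediately after the recognised residue.
            fragments.append(current)
            current = ""
    if current:
        # Keep the C-terminal remainder that carries no cleavage site.
        fragments.append(current)
    return fragments


# Hypothetical peptide, not taken from any SEQ ID NO of the specification.
example_peptide = "MAKTERLKEGS"
lys_c_fragments = digest(example_peptide, "endoLys-C")
glu_c_fragments = digest(example_peptide, "endoGlu-C")
```

Fragments predicted in this way would still need to be confirmed experimentally, for example by HPLC purification as noted above.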

In another embodiment, the present invention provides an isolated nucleic acid molecule
encoding a mammalian transcription factor homolog of Drosophila grh (i.e. a M-GRH)
and comprising a nucleotide sequence selected from SEQ ID NO:1, SEQ ID NO:3, SEQ
ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13 and SEQ ID
NO:15 or a nucleotide sequence having at least about 60% similarity to any one of SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ
ID NO:13 or SEQ ID NO:15 after optimal alignment or a nucleotide sequence capable of
hybridizing to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:13 or SEQ ID NO:15 or a complementary form thereof under
low stringency conditions.

Preferably, percentage nucleotide similarity levels include at least about 61% or at
least about 62% or at least about 63% or at least about 64% or at least about 65% or at
least about 66% or at least about 67% or at least about 68% or at least about 69% or at
least about 70% or at least about 71% or at least about 72% or at least about 73% or at
least about 74% or at least about 75% or at least about 76% or at least about 77% or at
least about 78% or at least about 79% or at least about 80% or at least about 81% or at
least about 82% or at least about 83% or at least about 84% or at least about 85% or at
least about 86% or at least about 87% or at least about 88% or at least about 89% or at
least about 90% or at least about 91% or at least about 92% or at least about 93% or at
least about 94% or at least about 95% or at least about 96% or at least about 97% or at
least about 98% or at least about 99% similarity.

The term "similarity" as used herein includes exact identity between compared sequences
at the nucleotide or amino acid level. Where there is non-identity at the nucleotide level,
"similarity" includes differences between sequences which result in different amino acids
that are nevertheless related to each other at the structural, functional, biochemical and/or
conformational levels. Where there is non-identity at the amino acid level, "similarity"
includes amino acids that are nevertheless related to each other at the structural, functional,
biochemical and/or conformational levels. In a particularly preferred embodiment,
nucleotide and amino acid sequence comparisons are made at the level of identity rather than
similarity.

Terms used to describe sequence relationships between two or more polynucleotides or
polypeptides include "reference sequence", "comparison window", "sequence similarity",
"sequence identity", "percentage of sequence similarity", "percentage of sequence
identity", "substantially similar" and "substantial identity". A "reference sequence" is at
least 12 but frequently 15 to 18 and often at least 25 or above, such as 30 monomer units,
inclusive of nucleotides and amino acid residues, in length. Because two polynucleotides
may each comprise (1) a sequence (i.e. only a portion of the complete polynucleotide
sequence) that is similar between the two polynucleotides, and (2) a sequence that is
divergent between the two polynucleotides, sequence comparisons between two (or more)
polynucleotides are typically performed by comparing sequences of the two
polynucleotides over a "comparison window" to identify and compare local regions of
sequence similarity. A "comparison window" refers to a conceptual segment of typically
12 contiguous residues that is compared to a reference sequence. The comparison window
may comprise additions or deletions (i.e. gaps) of about 20% or less as compared to the
reference sequence (which does not comprise additions or deletions) for optimal alignment
of the two sequences. Optimal alignment of sequences for aligning a comparison window
may be conducted by computerized implementations of algorithms (GAP, BESTFIT,
FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics
Computer Group, 575 Science Drive, Madison, WI, USA) or by inspection and the best
alignment (i.e. resulting in the highest percentage homology over the comparison window)
generated by any of the various methods selected. Reference also may be made to the
BLAST family of programs as for example disclosed by Altschul et al. (Nucl. Acids Res.
25: 3389, 1997). A detailed discussion of sequence analysis can be found in Unit 19.3 of
Ausubel et al. (In: Current Protocols in Molecular Biology, John Wiley & Sons Inc. 1994-1998).

The terms "sequence similarity" and "sequence identity" as used herein refer to the extent
that sequences are identical or functionally or structurally similar on a
nucleotide-by-nucleotide basis or an amino acid-by-amino acid basis over a window of comparison.
Thus, a "percentage of sequence identity", for example, is calculated by comparing two
optimally aligned sequences over the window of comparison, determining the number of
positions at which the identical nucleic acid base (e.g. A, T, C, G, I) or the identical amino
acid residue (e.g. Ala, Pro, Ser, Thr, Gly, Val, Leu, Ile, Phe, Tyr, Trp, Lys, Arg, His, Asp,
Glu, Asn, Gln, Cys and Met) occurs in both sequences to yield the number of matched
positions, dividing the number of matched positions by the total number of positions in the
window of comparison (i.e., the window size), and multiplying the result by 100 to yield
the percentage of sequence identity. For the purposes of the present invention, "sequence
identity" will be understood to mean the "match percentage" calculated by the DNASIS
computer program (Version 2.5 for Windows; available from Hitachi Software Engineering
Co., Ltd., South San Francisco, California, USA) using standard defaults as used in the
reference manual accompanying the software. Similar comments apply in relation to
sequence similarity.
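
The arithmetic of the "percentage of sequence identity" described above may be sketched as follows. This minimal Python example assumes the two sequences have already been optimally aligned by some other program and that gaps are written as "-"; it is not a re-implementation of DNASIS, GAP, BESTFIT or BLAST. The function name `percent_identity` and the two example windows are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Matched positions divided by window size, multiplied by 100.

    Both inputs must be pre-aligned strings of equal length; "-" marks a gap
    and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 12-position comparison window (one mismatch, one gap).
window_a = "ATGCCGTTAGCA"
window_b = "ATGCAGTT-GCA"
identity = percent_identity(window_a, window_b)

# The 60% figure is the lower similarity limit recited in the text above.
meets_threshold = identity >= 60.0
```

In this sketch the window size is taken as the full aligned length, so gapped positions lower the percentage; other programs may treat gaps differently.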

The present invention provides, therefore, an isolated nucleic acid molecule comprising a
sequence of nucleotides selected from SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ
ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13 and SEQ ID NO:15 or a
complementary form thereof. Such nucleic acid molecules encode mammalian homologs
of Drosophila grh. These mammalian homologs are proposed herein to be transcription
factors.

The present invention extends to variants of the nucleic acid molecules. A variant is a 
molecule having less than 100% sequence identity to a M-grh. Generally, a variant will 
still hybridize to a M-grh sequence under low stringency conditions. 

The term "variant" refers, therefore, to nucleotide sequences displaying substantial
sequence identity with a reference nucleotide sequence or polynucleotides that hybridize
with a reference sequence under stringency conditions that are defined hereinafter. The
terms "nucleotide sequence", "polynucleotide" and "nucleic acid molecule" may be used 
herein interchangeably and encompass polynucleotides in which one or more nucleotides 
have been added or deleted, or replaced with different nucleotides. In this regard, it is well 
understood in the art that certain alterations inclusive of mutations, additions, deletions and 
substitutions can be made to a reference nucleotide sequence whereby the altered 
polynucleotide retains the biological function or activity of the reference polynucleotide. 
The term "variant" also includes naturally-occurring allelic variants. 

Reference herein to a low stringency includes and encompasses from at least about 0 to at
least about 15% v/v formamide and from at least about 1 M to at least about 2 M salt for
hybridization, and at least about 1 M to at least about 2 M salt for washing conditions.
Generally, low stringency is at from about 25-30°C to about 42°C. The temperature may
be altered and higher temperatures used to replace formamide and/or to give alternative
stringency conditions. Alternative stringency conditions may be applied where necessary,
such as medium stringency, which includes and encompasses from at least about 16% v/v
to at least about 30% v/v formamide and from at least about 0.5 M to at least about 0.9 M
salt for hybridization, and at least about 0.5 M to at least about 0.9 M salt for washing
conditions, or high stringency, which includes and encompasses from at least about 31%
v/v to at least about 50% v/v formamide and from at least about 0.01 M to at least about
0.15 M salt for hybridization, and at least about 0.01 M to at least about 0.15 M salt for
washing conditions. In general, washing is carried out at Tm = 69.3 + 0.41 (G+C)% (Marmur
and Doty, J. Mol. Biol. 5: 109, 1962). However, the Tm of a duplex DNA decreases by 1°C
with every increase of 1% in the number of mismatch base pairs (Bonner and Laskey, Eur.
J. Biochem. 46: 83, 1974). Formamide is optional in these hybridization conditions.
Accordingly, particularly preferred levels of stringency are defined as follows: low
stringency is 6 x SSC buffer, 0.1% w/v SDS at 25°-42°C; a moderate stringency is 2 x SSC
buffer, 0.1% w/v SDS at a temperature in the range 20°C to 65°C; high stringency is 0.1 x
SSC buffer, 0.1% w/v SDS at a temperature of at least 65°C.
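
As a worked illustration of the Tm relationship given above, the following Python sketch evaluates Tm = 69.3 + 0.41 (G+C)% and then subtracts 1°C for each 1% of mismatched base pairs. The function names and the example sequence are hypothetical, and the calculation is only the empirical approximation stated in the text; it takes no account of salt concentration, formamide or duplex length.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def melting_temperature(sequence: str, mismatch_percent: float = 0.0) -> float:
    """Tm in degrees C: 69.3 + 0.41 * (G+C)%, less 1 degree per 1% mismatch."""
    return 69.3 + 0.41 * gc_percent(sequence) - mismatch_percent


# Hypothetical probe with 50% G+C content.
probe = "ATGCGCAT"
tm_perfect = melting_temperature(probe)              # approximately 89.8
tm_mismatched = melting_temperature(probe, 5.0)      # approximately 84.8
```

On this approximation, a duplex carrying 5% mismatched base pairs melts about 5°C lower than the perfectly matched duplex, which is why lower stringency washes tolerate more divergent homologs.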

The present invention extends to recombinant forms of the M-grh molecules as well as 
derivatives and homologs thereof. 

Accordingly, another aspect of the present invention provides an isolated polypeptide
having transcription factor activity, said polypeptide comprising a sequence of amino acids
encoded by a nucleotide sequence selected from SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13 or SEQ ID NO:15 or
a nucleotide sequence having at least about 60% similarity to any one of SEQ ID NO:1,
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID
NO:13 or SEQ ID NO:15 or a nucleotide sequence capable of hybridizing to any one of
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:13 or SEQ ID NO:15 or a complementary form thereof under low
stringency conditions.

In a preferred embodiment, the present invention provides a recombinant M-GRH
comprising an amino acid sequence selected from SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 or SEQ ID NO:16
or an amino acid sequence having at least about 60% similarity to SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 or
SEQ ID NO:16.

This aspect of the present invention extends to derivatives, homologs and analogs of
M-GRH molecules.

A "derivative" includes a mutant, fragment, part, portion or hybrid molecule. A derivative
generally but not exclusively carries a single or multiple amino acid substitution, addition
and/or deletion.

A "homolog" includes an analogous polypeptide having at least about 60% similar amino
acid sequence from another animal species or from a different locus within the same
species.

An "analog" is generally a chemical analog. Chemical analogs of the subject polypeptide
contemplated herein include, but are not limited to, modification to side chains,
incorporation of unnatural amino acids and/or their derivatives during peptide, polypeptide
or protein synthesis and the use of crosslinkers and other methods which impose
conformational constraints on the proteinaceous molecule or their analogs.

Examples of side chain modifications contemplated by the present invention include
modifications of amino groups such as by reductive alkylation by reaction with an
aldehyde followed by reduction with NaBH4; amidination with methylacetimidate;
acylation with acetic anhydride; carbamoylation of amino groups with cyanate;
trinitrobenzylation of amino groups with 2, 4, 6-trinitrobenzene sulphonic acid (TNBS);
acylation of amino groups with succinic anhydride and tetrahydrophthalic anhydride; and
pyridoxylation of lysine with pyridoxal-5-phosphate followed by reduction with NaBH4.

The guanidine group of arginine residues may be modified by the formation of
heterocyclic condensation products with reagents such as 2,3-butanedione, phenylglyoxal
and glyoxal.

The carboxyl group may be modified by carbodiimide activation via O-acylisourea
formation followed by subsequent derivatization, for example, to a corresponding amide.

Sulphydryl groups may be modified by methods such as carboxymethylation with
iodoacetic acid or iodoacetamide; performic acid oxidation to cysteic acid; formation of
mixed disulphides with other thiol compounds; reaction with maleimide, maleic anhydride
or other substituted maleimide; formation of mercurial derivatives using
4-chloromercuribenzoate, 4-chloromercuriphenylsulphonic acid, phenylmercury chloride,
2-chloromercuri-4-nitrophenol and other mercurials; carbamoylation with cyanate at alkaline
pH.

Tryptophan residues may be modified by, for example, oxidation with
N-bromosuccinimide or alkylation of the indole ring with 2-hydroxy-5-nitrobenzyl bromide
or sulphenyl halides. Tyrosine residues on the other hand, may be altered by nitration with
tetranitromethane to form a 3-nitrotyrosine derivative.

Modification of the imidazole ring of a histidine residue may be accomplished by
alkylation with iodoacetic acid derivatives or N-carbethoxylation with
diethylpyrocarbonate.

Examples of incorporating unnatural amino acids and derivatives during peptide synthesis
include, but are not limited to, use of norleucine, 4-amino butyric acid,
4-amino-3-hydroxy-5-phenylpentanoic acid, 6-aminohexanoic acid, t-butylglycine, norvaline,
phenylglycine, ornithine, sarcosine, 4-amino-3-hydroxy-6-methylheptanoic acid, 2-thienyl
alanine and/or D-isomers of amino acids. A list of unnatural amino acids contemplated
herein is shown in Table 3.

TABLE 3

Non-conventional amino acid Code     Non-conventional amino acid    Code

α-aminobutyric acid         Abu      L-N-methylalanine              Nmala
α-amino-α-methylbutyrate    Mgabu    L-N-methylarginine             Nmarg
aminocyclopropane-          Cpro     L-N-methylasparagine           Nmasn
  carboxylate                        L-N-methylaspartic acid        Nmasp
aminoisobutyric acid        Aib      L-N-methylcysteine             Nmcys
aminonorbornyl-             Norb     L-N-methylglutamine            Nmgln
  carboxylate                        L-N-methylglutamic acid        Nmglu
cyclohexylalanine           Chexa    L-N-methylhistidine            Nmhis
cyclopentylalanine          Cpen     L-N-methylisoleucine           Nmile
D-alanine                   Dal      L-N-methylleucine              Nmleu
D-arginine                  Darg     L-N-methyllysine               Nmlys
D-aspartic acid             Dasp     L-N-methylmethionine           Nmmet
D-cysteine                  Dcys     L-N-methylnorleucine           Nmnle
D-glutamine                 Dgln     L-N-methylnorvaline            Nmnva
D-glutamic acid             Dglu     L-N-methylornithine            Nmorn
D-histidine                 Dhis     L-N-methylphenylalanine        Nmphe
D-isoleucine                Dile     L-N-methylproline              Nmpro
D-leucine                   Dleu     L-N-methylserine               Nmser
D-lysine                    Dlys     L-N-methylthreonine            Nmthr
D-methionine                Dmet     L-N-methyltryptophan           Nmtrp
D-ornithine                 Dorn     L-N-methyltyrosine             Nmtyr
D-phenylalanine             Dphe     L-N-methylvaline               Nmval
D-proline                   Dpro     L-N-methylethylglycine         Nmetg
D-serine                    Dser     L-N-methyl-t-butylglycine      Nmtbug
D-threonine                 Dthr     L-norleucine                   Nle
D-tryptophan                Dtrp     L-norvaline                    Nva
D-tyrosine                  Dtyr     α-methyl-aminoisobutyrate      Maib
D-valine                    Dval     α-methyl-γ-aminobutyrate       Mgabu
D-α-methylalanine           Dmala    α-methylcyclohexylalanine      Mchexa
D-α-methylarginine          Dmarg    α-methylcyclopentylalanine     Mcpen
D-α-methylasparagine        Dmasn    α-methyl-α-napthylalanine      Manap
D-α-methylaspartate         Dmasp    α-methylpenicillamine          Mpen
D-α-methylcysteine          Dmcys    N-(4-aminobutyl)glycine        Nglu
D-α-methylglutamine         Dmgln    N-(2-aminoethyl)glycine        Naeg
D-α-methylhistidine         Dmhis    N-(3-aminopropyl)glycine       Norn
D-α-methylisoleucine        Dmile    N-amino-α-methylbutyrate       Nmaabu
D-α-methylleucine           Dmleu    α-napthylalanine               Anap
D-α-methyllysine            Dmlys    N-benzylglycine                Nphe
D-α-methylmethionine        Dmmet    N-(2-carbamylethyl)glycine     Ngln
D-α-methylornithine         Dmorn    N-(carbamylmethyl)glycine      Nasn
D-α-methylphenylalanine     Dmphe    N-(2-carboxyethyl)glycine      Nglu
D-α-methylproline           Dmpro    N-(carboxymethyl)glycine       Nasp
D-α-methylserine            Dmser    N-cyclobutylglycine            Ncbut
D-α-methylthreonine         Dmthr    N-cycloheptylglycine           Nchep
D-α-methyltryptophan        Dmtrp    N-cyclohexylglycine            Nchex
D-α-methyltyrosine          Dmty     N-cyclodecylglycine            Ncdec
D-α-methylvaline            Dmval    N-cyclododecylglycine          Ncdod
D-N-methylalanine           Dnmala   N-cyclooctylglycine            Ncoct
D-N-methylarginine          Dnmarg   N-cyclopropylglycine           Ncpro
D-N-methylasparagine        Dnmasn   N-cycloundecylglycine          Ncund
D-N-methylaspartate         Dnmasp   N-(2,2-diphenylethyl)glycine   Nbhm
D-N-methylcysteine          Dnmcys   N-(3,3-diphenylpropyl)glycine  Nbhe
D-N-methylglutamine         Dnmgln   N-(3-guanidinopropyl)glycine   Narg
D-N-methylglutamate         Dnmglu   N-(1-hydroxyethyl)glycine      Nthr
D-N-methylhistidine         Dnmhis   N-(hydroxyethyl)glycine        Nser
D-N-methylisoleucine        Dnmile   N-(imidazolylethyl)glycine     Nhis
D-N-methylleucine           Dnmleu   N-(3-indolylethyl)glycine      Nhtrp
D-N-methyllysine            Dnmlys   N-methyl-γ-aminobutyrate       Nmgabu
N-methylcyclohexylalanine   Nmchexa  D-N-methylmethionine           Dnmmet
D-N-methylornithine         Dnmorn   N-methylcyclopentylalanine     Nmcpen
N-methylglycine             Nala     D-N-methylphenylalanine        Dnmphe
N-methylaminoisobutyrate    Nmaib    D-N-methylproline              Dnmpro
N-(1-methylpropyl)glycine   Nile     D-N-methylserine               Dnmser
N-(2-methylpropyl)glycine   Nleu     D-N-methylthreonine            Dnmthr
D-N-methyltryptophan        Dnmtrp   N-(1-methylethyl)glycine       Nval
D-N-methyltyrosine          Dnmtyr   N-methyl-α-napthylalanine      Nmanap
D-N-methylvaline            Dnmval   N-methylpenicillamine          Nmpen
γ-aminobutyric acid         Gabu     N-(p-hydroxyphenyl)glycine     Nhtyr
L-t-butylglycine            Tbug     N-(thiomethyl)glycine          Ncys
L-ethylglycine              Etg      penicillamine                  Pen
L-homophenylalanine         Hphe     L-α-methylalanine              Mala
L-α-methylarginine          Marg     L-α-methylasparagine           Masn
L-α-methylaspartate         Masp     L-α-methyl-t-butylglycine      Mtbug
L-α-methylcysteine          Mcys     L-methylethylglycine           Metg
L-α-methylglutamine         Mgln     L-α-methylglutamate            Mglu
L-α-methylhistidine         Mhis     L-α-methylhomophenylalanine    Mhphe
L-α-methylisoleucine        Mile     N-(2-methylthioethyl)glycine   Nmet
L-α-methylleucine           Mleu     L-α-methyllysine               Mlys
L-α-methylmethionine        Mmet     L-α-methylnorleucine           Mnle
L-α-methylnorvaline         Mnva     L-α-methylornithine            Morn
L-α-methylphenylalanine     Mphe     L-α-methylproline              Mpro
L-α-methylserine            Mser     L-α-methylthreonine            Mthr
L-α-methyltryptophan        Mtrp     L-α-methyltyrosine             Mtyr
L-α-methylvaline            Mval     L-N-methylhomophenylalanine    Nmhphe

N-(N-(2,2-diphenylethyl)carbamylmethyl)glycine    Nnbhm
N-(N-(3,3-diphenylpropyl)carbamylmethyl)glycine   Nnbhe
1-carboxy-1-(2,2-diphenylethylamino)cyclopropane  Nmbc

Crosslinkers can be used, for example, to stabilize 3D conformations, using
homo-bifunctional crosslinkers such as the bifunctional imido esters having (CH2)n spacer groups
with n=1 to n=6, glutaraldehyde, N-hydroxysuccinimide esters and hetero-bifunctional
reagents which usually contain an amino-reactive moiety such as N-hydroxysuccinimide
and another group specific-reactive moiety such as maleimido or dithio moiety (SH) or
carbodiimide (COOH). In addition, peptides can be conformationally constrained by, for
example, incorporation of Cα and Nα-methylamino acids, introduction of double bonds
between Cα and Cβ atoms of amino acids and the formation of cyclic peptides or analogues
by introducing covalent bonds such as forming an amide bond between the N and C
termini, between two side chains or between a side chain and the N or C terminus.

The present invention further contemplates chemical analogs of the subject polypeptide
capable of acting as antagonists or agonists of M-GRH or which can act as functional
analogs of M-GRH. Chemical analogs may not necessarily be derived from the instant
M-GRH molecules but may share certain conformational similarities. Alternatively, chemical
analogs may be specifically designed to mimic certain physicochemical properties of the
subject M-GRH molecules. Chemical analogs may be chemically synthesized or may be
detected following, for example, natural product screening. The latter refers to molecules
identified from various environmental sources such as river beds, coral, plants,
microorganisms and insects.

These types of modifications may be important to stabilize the subject M-GRH molecules
if administered to an individual or for use as a diagnostic reagent.

Other derivatives contemplated by the present invention include a range of glycosylation
variants from a completely unglycosylated molecule to a modified glycosylated molecule. 
Altered glycosylation patterns may result from expression of recombinant molecules in 
different host cells. 

The designing of mimetics to a pharmaceutically active compound is a known approach to
the development of pharmaceuticals based on a "lead" compound. This might be desirable
where the active compound is difficult or expensive to synthesize or where it is unsuitable
for a particular method of administration, e.g. peptides are unsuitable active agents for oral
compositions as they tend to be quickly degraded by proteases in the alimentary canal.
Mimetic design, synthesis and testing is generally used to avoid randomly screening large
numbers of molecules for a target property.

There are several steps commonly taken in the design of a mimetic from a compound 
having a given target property. First, the particular parts of the compound that are critical 
and/or important in determining the target property are determined. In the case of a 
peptide, this can be done by systematically varying the amino acid residues in the peptide, 
e.g. by substituting each residue in turn. Alanine scans of peptides are commonly used to 
refine such peptide motifs. These parts or residues constituting the active region of the 
compound are known as its "pharmacophore". 

Once the pharmacophore has been found, its structure is modeled according to its physical 
properties, e.g. stereochemistry, bonding, size and/or charge, using data from a range of 
sources, e.g. spectroscopic techniques, x-ray diffraction data and NMR. Computational 
analysis, similarity mapping (which models the charge and/or volume of a pharmacophore, 
rather than the bonding between atoms) and other techniques can be used in this modeling 
process. 

In a variant of this approach, the three-dimensional structure of the ligand and its binding
partner are modeled. This can be especially useful where the ligand and/or binding partner
change conformation on binding, allowing the model to take account of this in the design
of the mimetic. Modeling can be used to generate inhibitors which interact with the linear
sequence or a three-dimensional configuration.

A template molecule is then selected onto which chemical groups which mimic the
pharmacophore can be grafted. The template molecule and the chemical groups grafted
onto it can conveniently be selected so that the mimetic is easy to synthesize, is likely to be
pharmacologically acceptable, and does not degrade in vivo, while retaining the biological
activity of the lead compound. Alternatively, where the mimetic is peptide-based, further
stability can be achieved by cyclizing the peptide, increasing its rigidity. The mimetic or
mimetics found by this approach can then be screened to see whether they have the target
property, or to what extent they exhibit it. Further optimization or modification can then be
carried out to arrive at one or more final mimetics for in vivo or clinical testing.

The goal of rational drug design is to produce structural analogs of biologically active 
polypeptides of interest or of small molecules with which they interact (e.g. agonists, 
antagonists, inhibitors or enhancers) in order to fashion drugs which are, for example, 
more active or stable forms of the polypeptide, or which, e.g. enhance or interfere with the 
function of a polypeptide in vivo. See, e.g. Hodgson (BioTechnology 9: 19-21, 1991). In 
one approach, one first determines the three-dimensional structure of a protein of interest 
by x-ray crystallography, by computer modeling or most typically, by a combination of 
approaches. Useful information regarding the structure of a polypeptide may also be 
gained by modeling based on the structure of homologous proteins. An example of rational 
drug design is the development of HIV protease inhibitors (Erickson et al, Science 249: 
527-533, 1990). In addition, target molecules may be analyzed by an alanine scan (Wells, 
Methods Enzymol. 202: 2699-2705, 1991). In this technique, an amino acid residue is 
replaced by Ala and its effect on the peptide's activity is determined. Each of the amino 
acid residues of the peptide is analyzed in this manner to determine the important regions 
of the peptide. 

It is also possible to isolate a target-specific antibody, selected by a functional assay, and 
then to solve its crystal structure. In principle, this approach yields a pharmacore upon 
which subsequent drug design can be based. It is possible to bypass protein crystallography 
altogether by generating anti-idiotypic antibodies (anti-ids) to a functional, 
pharmacologically active antibody. As a mirror image of a mirror image, the binding site 
of the anti-ids would be expected to be an analog of the original receptor. The anti-id could 
then be used to identify and isolate peptides from banks of chemically or biologically 
produced peptides. Selected peptides would then act as the pharmacore. 

Two-hybrid screening is also useful in identifying other members of a biochemical or 
genetic pathway associated with a target. Two-hybrid screening conveniently uses 
Saccharomyces cerevisiae and Schizosaccharomyces pombe. Target interactions and screens 
for inhibitors can be carried out using the yeast two-hybrid system, which takes advantage of 
transcriptional factors that are composed of two physically separable, functional domains. 
The most commonly used is the yeast GAL4 transcriptional activator consisting of a DNA 
binding domain and a transcriptional activation domain. Two different cloning vectors are 
used to generate separate fusions of the GAL4 domains to genes encoding potential 
binding proteins. The fusion proteins are co-expressed, targeted to the nucleus and if 
interactions occur, activation of a reporter gene (e.g. lacZ) produces a detectable 
phenotype. In the present case, for example, S. cerevisiae is co-transformed with a library 
or vector expressing a cDNA-GAL4 activation domain fusion and a vector expressing a 
holocyclotoxin-GAL4 binding domain fusion. If lacZ is used as the reporter gene, 
co-expression of the fusion proteins will produce a blue color. Small molecules or other 
candidate compounds which interact with a target will result in loss of colour of the cells. 
Reference may be made to the yeast two-hybrid systems as disclosed by Munder et al 
(Appl. Microbiol. Biotechnol. 52: 311-320, 1999) and Young et al (Nat. Biotechnol. 16: 
946-950, 1998). Molecules thus identified by this system are then re-tested in animal cells. 

The present invention further contemplates methods of screening for drugs comprising, for 
example, contacting a candidate drug with a transcription factor. These molecules are 
referred to herein as "targets", "a target" or "target molecule". The screening procedure 
includes assaying for the presence of a complex between the drug and the target. One form 
of assay involves competitive binding assays. In such competitive binding assays, the 
target is typically labeled. Free target is separated from any putative complex and the 
amount of free (i.e. uncomplexed) label is a measure of the binding of the agent being 
tested to the target molecule. One may also measure the amount of bound, rather than free, 
target. It is also possible to label the compound rather than the target and to measure the 
amount of compound binding to target in the presence and in the absence of the drug being 
tested. 
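
As a hedged illustration of the last variant, in which a labeled compound is measured bound to the target with and without the test drug, the read-out can be reduced to a percent inhibition figure. The function name and the example counts below are hypothetical and are not taken from the specification.

```python
def percent_inhibition(bound_without_drug: float, bound_with_drug: float) -> float:
    """Percent reduction in labeled-compound binding caused by the test drug.

    bound_without_drug: signal from labeled compound bound to target, no drug.
    bound_with_drug:    the same measurement in the presence of the drug.
    Both values are assumed to be background-corrected and in the same units.
    """
    if bound_without_drug <= 0:
        raise ValueError("control binding must be positive")
    return 100.0 * (1.0 - bound_with_drug / bound_without_drug)


# Illustrative counts only: 2000 units bound without drug, 500 with drug.
print(percent_inhibition(2000.0, 500.0))
```

A value near zero suggests the drug does not compete for the target, while a value approaching 100 suggests near-complete displacement of the labeled compound.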

Another technique for drug screening provides high throughput screening for compounds 
having suitable binding affinity to a target and is described in detail in Geysen 
(International Patent Publication No. WO 84/03564). Briefly stated, large numbers of 
different small peptide test compounds are synthesized on a solid substrate, such as plastic 
pins or some other surface. The peptide test compounds are reacted with a target and 
washed. Bound target molecule is then detected by methods well known in the art. This 
method may be adapted for screening for non-peptide, chemical entities. This aspect, 
therefore, extends to combinatorial approaches to screening for target antagonists or 
agonists. 

Purified target can be coated directly onto plates for use in the aforementioned drug 
screening techniques. However, non-neutralizing antibodies to the target may also be used 
to immobilize the target on the solid phase. 

The present invention also contemplates the use of competitive drug screening assays in 
which neutralizing antibodies capable of specifically binding the target compete with a test 
compound for binding to the target or fragments thereof. In this manner, the antibodies can 
be used to detect the presence of any peptide which shares one or more antigenic 
determinants of the target. 

The present invention also provides a method for identifying an M-GRH, said method 
comprising screening a nucleotide database and identifying a nucleotide sequence having 
at least 60% similarity to SEQ ID NO:17 or SEQ ID NO:34 or SEQ ID NO:36 or SEQ ID 
NO:38 after optimal alignment. 
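
The screening step can be illustrated with a minimal sketch that globally aligns a query against each database entry and keeps entries at or above the 60% threshold. The sketch makes several assumptions that are not part of the specification: it uses a simple Needleman-Wunsch alignment with unit match, mismatch and gap scores, it reports percent identity over the alignment length as a stand-in for "similarity", and the sequences and names shown are invented placeholders rather than any of the listed SEQ ID NOs.

```python
def global_identity(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity of a and b over one optimal global alignment."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right cell, counting identical columns.
    i, j, matches, length = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        length += 1
    return 100.0 * matches / length if length else 0.0


def screen(database: dict[str, str], query: str, threshold: float = 60.0) -> list[str]:
    """Return the names of database entries meeting the identity threshold."""
    return [name for name, seq in database.items() if global_identity(query, seq) >= threshold]


# Placeholder sequences for illustration only.
print(screen({"candidate_1": "ACGTACGA", "candidate_2": "TTTTTTTT"}, "ACGTACGT"))
```

For databases of realistic size, an established alignment tool would be the practical choice; the quadratic-time routine above is intended only to make the thresholding logic concrete.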

Reference to a "nucleotide database" includes screening an existing genomic or cDNA or 
mRNA database or screening for a target nucleic acid molecule in a mammalian cell such 
as using oligonucleotide probes or primers, sequencing the target molecule and comparing 
the sequence to SEQ ID NO:17 or SEQ ID NO:34 or SEQ ID NO:36 or SEQ ID NO:38. 

In an alternative method, a database of mammalian protein sequences is screened for an 
amino acid sequence having at least 60% similarity to the amino acid sequence encoded by 
SEQ ID NO:17 or SEQ ID NO:34 or SEQ ID NO:36 or SEQ ID NO:38. Again, a 
"database" includes a de novo protein sequence isolated and identified on a transcription 
factor isolated from a mammalian cell. 

In yet another alternative, an M-grh or its protein product is deemed one which has at least 
about 60% similarity at the nucleotide level to SEQ ID NO:1, SEQ ID NO:3, SEQ ID 
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13 or SEQ ID NO:15 or 
at the amino acid level to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, 
SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 or SEQ ID NO:16. 

Still yet another aspect of the present invention provides a means of identifying a 
nucleotide sequence likely to encode an M-GRH transcription factor, said method 
comprising interrogating a mammalian genome database conceptually translated into 
different reading frames with an amino acid sequence defining Drosophila GRH or any 
one of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ 
ID NO:12, SEQ ID NO:14 and SEQ ID NO:16 and identifying a nucleotide sequence 
corresponding to an amino acid sequence having at least about 60% similarity to 
Drosophila GRH or to any one of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID 
NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 and SEQ ID NO:16. 

Preferably, the genome is conceptually translated into from about 3 to about 6 reading 
frames and more preferably six reading frames. 
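
Conceptual translation into six reading frames can be sketched as follows. The sketch assumes the standard genetic code and an unambiguous DNA alphabet (A, C, G, T); the function names and the short example sequence are illustrative only and do not appear in the specification.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frames(dna: str) -> dict[str, str]:
    """Translate both strands in all three offsets: frames +1..+3 and -1..-3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Short illustrative sequence, not taken from any SEQ ID NO.
for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```

Each translated frame can then be compared with the query amino acid sequence, so that a coding region is found regardless of its strand or phase in the genomic record.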

It is proposed in accordance with the present invention that the M-GRH transcription 
factors are involved in the modulation of expression of a number of genes including 
developmentally regulated genes. Accordingly, aberrations in the M-GRH or M-grh 
molecules are proposed to cause over or under expression of particular genes leading to a 
potentially unwanted phenotype. The phenotype may manifest itself pre- or post-natally. A 
pre-natal manifestation includes at the embryo or fetus stage. Conditions contemplated 
include developmentally-determined disease conditions such as poor brain development, 
poor muscle or bone development, aberrations in facial or cranial structures, malformed 
spinal structures, predispositions to a range of cancers including melanomas and 
immunological disorders. 

Accordingly, another aspect of the present invention contemplates a method for detecting 
an aberrant phenotype or a propensity for an aberrant phenotype to develop, said method 
comprising screening for a variation in a nucleotide sequence encoding a mammalian 
MGR, BOM and/or SOM or their homologs. 

Reference herein to "MGR", "BOM" and "SOM" includes murine and human forms of 
these molecules such as human MGR p49 (SEQ ID NO:2), human MGR p70 (SEQ ID 
NO:4), human BOM (SEQ ID NO:6), human SOM (SEQ ID NO:8), murine MGR p61 
(SEQ ID NO:10), murine MGR p70 (SEQ ID NO:12), murine BOM (SEQ ID NO:14) and 
murine SOM (SEQ ID NO:16). 

A homolog of MGR, BOM and SOM is as herein defined including a molecule having at 
least about 60% amino acid sequence similarity to MGR, BOM or SOM or at least about 
60% nucleic acid similarity to mgr, bom or som or a nucleic acid molecule capable of 
hybridizing to the coding strands of mgr, bom or som or complementary forms thereof 
under low stringency conditions. 

Aberrations may also be detectable at the amino acid level when the mammalian homologs 
of Drosophila grh encode protein transcription factors. 

Accordingly, another aspect of the present invention contemplates a method for detecting 
an aberrant phenotype or a propensity for an aberrant phenotype to develop, said method 
comprising screening for a variation in an amino acid sequence defining MGR, BOM 
and/or SOM or their homologs. 

As above, reference to MGR, BOM and SOM includes amino acid sequences defining 
human MGR p49 (SEQ ID NO:2), human MGR p70 (SEQ ID NO:4), human BOM (SEQ 
ID NO:6), human SOM (SEQ ID NO:8), murine MGR p61 (SEQ ID NO:10), murine 
MGR p70 (SEQ ID NO:12), murine BOM (SEQ ID NO:14) and murine SOM (SEQ ID 
NO:16). 

As stated above, the mammalian transcription factors and their genetic sequences have a 
range of diagnostic and therapeutic utilities. The detection of an aberrant transcription 
factor or a nucleotide sequence encoding an aberrant transcription factor is indicative of a 
disease condition including a degenerative or developmental disease condition. 

Any number of methods may be employed to detect aberrant transcription factors or their 
genetic sequences. Immunological testing is one particular method. Accordingly, the 
present invention extends to antibodies and other immunological agents directed to or 
preferably specific for the mammalian transcription factors or a fragment thereof. The 
antibodies may be monoclonal or polyclonal or may comprise Fab fragments or synthetic 
forms. 

Specific antibodies can be used to screen for the subject mammalian transcription factors 
and/or their fragments. Techniques for the assays contemplated herein are known in the art 
and include, for example, sandwich assays and ELISA. 

It is within the scope of this invention to include any second antibodies (monoclonal, 
polyclonal or fragments of antibodies or synthetic antibodies) directed to the first 
mentioned antibodies referred to above. Both the first and second antibodies may be used 
in detection assays or a first antibody may be used with a commercially available anti- 
immunoglobulin antibody. An antibody as contemplated herein includes any antibody 
specific to any region of the mammalian transcription factors. 

Both polyclonal and monoclonal antibodies are obtainable by immunization with 
mammalian transcription factors or antigenic fragments thereof and either type is utilizable 
for immunoassays. The methods of obtaining both types of sera are well known in the art. 
Polyclonal sera are less preferred but are relatively easily prepared by injection of a 
suitable laboratory animal with an effective amount of subject polypeptide, or antigenic 
parts thereof, collecting serum from the animal and isolating specific sera by any of the 
known immunoadsorbent techniques. Although antibodies produced by this method are 
utilizable in virtually any type of immunoassay, they are generally less favoured because 
of the potential heterogeneity of the product. 

The use of monoclonal antibodies in an immunoassay is particularly preferred because of 
the ability to produce them in large quantities and the homogeneity of the product. The 
preparation of hybridoma cell lines for monoclonal antibody production derived by fusing 
an immortal cell line and lymphocytes sensitized against the immunogenic preparation can 
be done by techniques which are well known to those who are skilled in the art. 

Another aspect of the present invention contemplates, therefore, a method for detecting a 
mammalian transcription factor or fragment thereof in a biological sample from a subject, 
said method comprising contacting said biological sample with an antibody specific for 
said mammalian transcription factor or fragment thereof or its derivatives or homologs for 
a time and under conditions sufficient for an antibody-polypeptide complex to form, and 
then detecting said complex. 

A biological sample includes a cell extract. 

Reference to a "mammalian transcription factor" is considered to be a reference to a 
homolog of Drosophila grh, i.e. M-GRH. 

The presence of the instant mammalian transcription factors or their fragments may be 
detected in a number of ways such as by Western blotting and ELISA procedures. A wide 
range of immunoassay techniques are available as can be seen by reference to U.S. Patent 
Nos. 4,016,043, 4,424,279 and 4,018,653. 

Sandwich assays are among the most useful and commonly used assays and are favoured 
for use in the present invention. A number of variations of the sandwich assay technique 
exist, and all are intended to be encompassed by the present invention. Briefly, in a typical 
forward assay, an unlabeled antibody is immobilized on a solid substrate and the sample to 
be tested brought into contact with the bound molecule. After a suitable period of 
incubation, for a period of time sufficient to allow formation of an antibody-antigen 
complex, a second antibody specific to the antigen, labeled with a reporter molecule 
capable of producing a detectable signal is then added and incubated, allowing time 
sufficient for the formation of another complex of antibody-antigen-labeled antibody. Any 
unreacted material is washed away, and the presence of the antigen is determined by 
observation of a signal produced by the reporter molecule. The results may either be 
qualitative, by simple observation of the visible signal, or may be quantitated by 
comparing with a control sample containing known amounts of hapten. Variations on the 
forward assay include a simultaneous assay, in which both sample and labeled antibody are 
added simultaneously to the bound antibody. These techniques are well known to those 
skilled in the art, including any minor variations as will be readily apparent. In accordance 
with the present invention the sample is one which might contain a subject transcription 
factor, including a tissue biopsy, blood, synovial fluid and/or lymph. The sample is, 
therefore, generally a biological sample comprising biological fluid. The transcription 
factor is likely to be in blood or other fluid in the case where cell apoptosis is occurring. 

In the typical forward sandwich assay, a first antibody having specificity for the instant 
polypeptide or antigenic parts thereof, is either covalently or passively bound to a solid 
surface. The solid surface is typically glass or a polymer, the most commonly used 
polymers being cellulose, polyacrylamide, nylon, polystyrene, polyvinyl chloride or 
polypropylene. The solid supports may be in the form of tubes, beads, discs or microplates, 
or any other surface suitable for conducting an immunoassay. The binding processes are 
well-known in the art and generally consist of cross-linking, covalently binding or 
physically adsorbing; the polymer-antibody complex is then washed in preparation for the 
test sample. An aliquot of the sample to be tested is then added to the solid phase complex and 
incubated for a period of time sufficient (e.g. 2-40 minutes or where more convenient, 
overnight) and under suitable conditions (e.g. from about 20°C to about 40°C) to allow 
binding of any subunit present to the antibody. Following the incubation period, the 
antibody subunit solid phase is washed and dried and incubated with a second antibody 
specific for a portion of the hapten. The second antibody is linked to a reporter molecule 
which is used to indicate the binding of the second antibody to the hapten. 

An alternative method involves immobilizing the target molecules in the biological sample 
and then exposing the immobilized target to specific antibody which may or may not be 
labeled with a reporter molecule. Depending on the amount of target and the strength of 
the reporter molecule signal, a bound target may be detectable by direct labelling with the 
antibody. Alternatively, a second labeled antibody, specific to the first antibody is exposed 
to the target-first antibody complex to form a target-first antibody-second antibody tertiary 
complex. The complex is detected by the signal emitted by the reporter molecule. 

By "reporter molecule" as used in the present specification, is meant a molecule which, by 
its chemical nature, provides an analytically identifiable signal which allows the detection 
of antigen-bound antibody. Detection may be either qualitative or quantitative. The most 
commonly used reporter molecules in this type of assay are either enzymes, fluorophores 
or radionuclide containing molecules (i.e. radioisotopes) and chemiluminescent molecules. 

In the case of an enzyme immunoassay, an enzyme is conjugated to the second antibody, 
generally by means of glutaraldehyde or periodate. As will be readily recognized, however, 
a wide variety of different conjugation techniques exist, which are readily available to the 
skilled artisan. Commonly used enzymes include horseradish peroxidase, glucose oxidase, 
beta-galactosidase and alkaline phosphatase, amongst others. The substrates to be used 
with the specific enzymes are generally chosen for the production, upon hydrolysis by the 
corresponding enzyme, of a detectable colour change. Examples of suitable enzymes 
include alkaline phosphatase and peroxidase. It is also possible to employ fluorogenic 
substrates, which yield a fluorescent product rather than the chromogenic substrates noted 
above. In all cases, the enzyme-labeled antibody is added to the first antibody-hapten 
complex, allowed to bind, and then the excess reagent is washed away. A solution 
containing the appropriate substrate is then added to the complex of 
antibody-antigen-antibody. The substrate will react with the enzyme linked to the second 
antibody, giving a 
qualitative visual signal, which may be further quantitated, usually spectrophotometrically, 
to give an indication of the amount of hapten which was present in the sample. "Reporter 
molecule" also extends to use of cell agglutination or inhibition of agglutination such as 
red blood cells on latex beads, and the like. 
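
Spectrophotometric quantitation of this kind is commonly done by reading the sample against standards containing known amounts of analyte. The following is a minimal sketch of that idea using piecewise-linear interpolation; the function name, the standard amounts and the absorbance values are hypothetical, and the sketch assumes the signal rises monotonically with amount over the range of the standards.

```python
def interpolate_amount(standards: list[tuple[float, float]], signal: float) -> float:
    """Estimate analyte amount from a signal using known standards.

    standards: (known_amount, measured_signal) pairs from control wells.
    signal:    measured signal of the unknown sample, same units.
    Uses straight-line interpolation between the two bracketing standards.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return a0
            return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)
    raise ValueError("signal outside the range of the standards")


# Hypothetical standards: amount (arbitrary units) versus absorbance.
curve = [(0.0, 0.05), (10.0, 0.45), (20.0, 0.85)]
print(interpolate_amount(curve, 0.65))
```

Signals outside the range of the standards are rejected rather than extrapolated, since the response of an enzyme-linked assay typically plateaus at high analyte concentrations.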

Alternatively, fluorescent compounds, such as fluorescein and rhodamine, may be 
chemically coupled to antibodies without altering their binding capacity. When activated 
by illumination with light of a particular wavelength, the fluorochrome-labeled antibody 
absorbs the light energy, inducing a state of excitability in the molecule, followed by 
emission of the light at a characteristic colour visually detectable with a light microscope. 
The fluorescent labeled antibody is allowed to bind to the first antibody-hapten complex. 
After washing off the unbound reagent, the remaining tertiary complex is then exposed to 
light of the appropriate wavelength; the fluorescence observed indicates the presence of 
the hapten of interest. Immunofluorescence and EIA techniques are both very well 
established in the art and are particularly preferred for the present method. However, other 
reporter molecules, such as radioisotope, chemiluminescent or bioluminescent molecules, 
may also be employed. 




The present invention also contemplates genetic assays such as involving PCR analysis to 
detect RNA expression products of a genetic sequence encoding a mammalian 
transcription factor. The genetic assays may also be able to detect nucleotide 
polymorphisms or other substitutions, additions and/or deletions in the nucleotide sequence 
of a mammalian transcription factor. Changes in levels of mammalian transcription factor 
expression such as following mutations in the promoter or regulatory regions or loss of 
mammalian transcription factor activity following mutations in mammalian transcription 
factor nucleotides are proposed to be indicative of a disease condition or a propensity for a 
disease condition to develop. For example, a cell biopsy could be obtained and DNA or 
RNA extracted. Alternative methods which may be used alone or in conjunction with other 
methods include direct nucleotide sequencing or mutation scanning such as single-stranded 
conformation polymorphism (SSCP) analysis as well as specific oligonucleotide 
hybridization, denaturing high performance liquid chromatography, first nucleotide change 
(FNC) amongst others. 
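
Where direct nucleotide sequencing is used, the simplest comparison is a position-by-position check of the sample read against a reference sequence. The sketch below covers substitutions only and assumes the two sequences are already aligned and of equal length; additions and deletions would require an alignment step first. The function name and the sequences are illustrative and are not drawn from the specification.

```python
def point_differences(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List substitutions between two pre-aligned, equal-length sequences.

    Returns (position, reference_base, sample_base) tuples with positions
    counted from 1.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Illustrative sequences only.
for position, ref_base, sample_base in point_differences("ATGGCC", "ATGACC"):
    print(f"{ref_base}{position}{sample_base}")
```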

The present invention extends to polymorphisms in the M-grh genes which lead to 
healthy or abnormal phenotypes. 

The present invention further contemplates kits to facilitate the rapid detection of 
mammalian transcription factors or their fragments in a subject's biological fluid. 

Again, a biological fluid includes a cell extract such as a DNA/RNA extract. 

Still yet another aspect of the present invention contemplates genomic sequences including 
gene sequences encoding a mammalian transcription factor as well as regulatory regions 
such as promoters, terminators and transcription/translation enhancer regions associated 
with the gene encoding a mammalian transcription factor. 

The term "gene" is used in its broadest sense and includes cDNA corresponding to the 
exons of a gene. Accordingly, reference herein to a "gene" is to be taken to include:- 

(i) a classical genomic gene consisting of transcriptional and/or translational 
regulatory sequences and/or a coding region and/or non-translated sequences (i.e. 
introns, 5'- and 3'-untranslated sequences); or 

(ii) mRNA or cDNA corresponding to the coding regions (i.e. exons) and 5'- and 
3'-untranslated sequences of the gene. 

The term "gene" is also used to describe synthetic or fusion molecules encoding all or part 
of an expression product. In particular embodiments, the term "nucleic acid molecule" and 
"gene" may be used interchangeably. 

In a particularly useful embodiment, the present invention provides a promoter for the 
mammalian transcription factor gene. The identification of the promoter permits 
developmentally-regulated expression of particular genetic sequences. The latter would 
include a range of therapeutic molecules such as cytokines, growth factors, antibiotics or 
other molecules to assist in the treatment of particular disease conditions. 

Accordingly, another aspect of the present invention provides an M-grh-specific promoter 
or functional derivative or homolog thereof, said promoter in situ operably linked to a 
nucleotide sequence comprising any one of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, 
SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13 or SEQ ID NO:15 or their 
complementary forms or a nucleotide sequence having at least about 60% similarity to 
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID 
NO:11, SEQ ID NO:13 or SEQ ID NO:15 or their complementary forms or a nucleotide 
sequence capable of hybridizing to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID 
NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13 or SEQ ID NO:15 or their 
complementary forms under low stringency conditions. 

The promoter is conveniently resident in a vector which comprises unique restriction sites 
to facilitate the introduction of genetic sequences operably linked to the promoter. 

All such constructs are useful in order to produce recombinant M-GRH molecules and/or 
in gene therapy protocols. 

The present invention further contemplates a genetically modified animal. 

More particularly, the present invention provides an animal model useful for screening for 
agents capable of ameliorating the effects of an aberrant M-GRH or M-grh gene. In one 
embodiment, the animal model produces low amounts of M-grh. Such an animal would 
have a predisposition for a range of diseases including developmentally regulated diseases. 
The animal model is useful for screening for agents which ameliorate such conditions. 

Accordingly, another aspect of the present invention provides a genetically modified 
animal wherein said animal produces low amounts of M-grh relative to a non-genetically 
modified animal of the same species. Reference to "low amounts" includes zero amounts 
or up to about 10% lower than normalized amounts. 

Preferably, the genetically modified animal is a mouse, rat, guinea pig, rabbit, pig, sheep or 
goat. More preferably, the genetically modified animal is a mouse or rat. Most preferably, 
the genetically modified animal is a mouse. 

Accordingly, a preferred aspect of the present invention provides a genetically modified 
mouse wherein said mouse produces low amounts of M-grh relative to a non-genetically 
modified mouse of the same strain. 

The animal model contemplated by the present invention comprises, therefore, an animal 
which is substantially incapable of producing an M-grh. Generally, but not exclusively, such 
an animal is referred to as a homozygous or heterozygous M-grh-knockout animal. 

The animal models of the present invention may be in the form of the animals or may be, 
for example, in the form of embryos for transplantation. The embryos are preferably 
maintained in a frozen state and may optionally be sold with instructions for use. 

The genetically modified animals may also produce larger amounts of M-GRH. For 
example, over expression of normal M-grh or mutant M-grh may produce dominant 
negative effects and may become useful disease models. 

Accordingly, another aspect of the present invention is directed to a genetically modified 
animal over-expressing genetic sequences encoding M-grh. 

A genetically modified animal includes a transgenic animal, or a "knock-out" or "knock- 
in" animal. 

Yet another aspect of the present invention provides a targeting vector useful for 
inactivating a gene encoding M-GRH, said targeting vector comprising two segments of 
genetic material encoding said M-GRH flanking a positive selectable marker wherein 
when said targeting vector is transfected into embryonic stem (ES) cells and the marker 
selected, an ES cell is generated in which the gene encoding said M-GRH is inactivated by 
homologous recombination. 

Preferably, the ES cells are from mice, rats, guinea pigs, pigs, sheep or goats. Most 
preferably, the ES cells are from mice. 

Still yet another aspect of the present invention is directed to the use of a targeting vector 
as defined above in the manufacture of a genetically modified animal substantially 
incapable of producing M-GRH. 

Even still another aspect of the present invention is directed to the use of a targeting vector 
as defined above in the manufacture of a genetically modified mouse substantially 
incapable of producing M-GRH. 

Preferably, the vector is DNA. A selectable marker in the targeting vector allows for 
selection of targeted cells that have stably incorporated the targeting DNA. This is 
especially useful when employing relatively low efficiency transformation techniques such 
as electroporation, calcium phosphate precipitation and liposome fusion where typically 
fewer than 1 in 1000 cells will have stably incorporated the exogenous DNA. Using high 
efficiency methods, such as microinjection into nuclei, typically from 5-25% of the cells 
will have incorporated the targeting DNA; and it is, therefore, feasible to screen the 
targeted cells directly without the necessity of first selecting for stable integration of a 
selectable marker. Either isogenic or non-isogenic DNA may be employed. 
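
The practical difference between these incorporation frequencies can be made concrete with a short calculation. The sketch below assumes each cell incorporates the DNA independently with the same probability, which is a simplification, and the 95% confidence level and function name are illustrative choices that do not come from the specification. It estimates how many cells would need to be screened directly to have a given chance of finding at least one cell carrying the targeting DNA.

```python
import math


def cells_to_screen(frequency: float, confidence: float = 0.95) -> int:
    """Smallest n with P(at least one positive among n cells) >= confidence.

    frequency:  assumed per-cell probability of stable incorporation (0 < f < 1).
    confidence: desired probability of finding at least one positive cell.
    Derived from 1 - (1 - f)**n >= confidence.
    """
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


# Frequencies quoted in the text: about 1 in 1000, and 5% to 25%.
for f in (0.001, 0.05, 0.25):
    print(f, cells_to_screen(f))
```

At frequencies in the 5-25% range the required number of cells is small enough to screen directly, whereas at around 1 in 1000 a selectable marker is the more practical way to enrich for cells that have incorporated the DNA.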

Examples of selectable markers include genes conferring resistance to compounds such as 
antibiotics, genes conferring the ability to grow on selected substrates, and genes encoding 
proteins that produce detectable signals such as luminescence. A wide variety of such 
markers are known and available, including, for example, antibiotic resistance genes such 
as the neomycin resistance gene (neo) and the hygromycin resistance gene (hyg). 
Selectable markers also include genes conferring the ability to grow on certain media 
substrates such as the tk gene (thymidine kinase) or the hprt gene (hypoxanthine 
phosphoribosyltransferase) which confer the ability to grow on HAT medium 
(hypoxanthine, aminopterin and thymidine); and the bacterial gpt gene (guanine/xanthine 
phosphoribosyltransferase) which allows growth on MAX medium (mycophenolic acid, 
adenine and xanthine). Other selectable markers for use in mammalian cells and plasmids 
carrying a variety of selectable markers are described in Sambrook et al, Molecular 
Cloning - A Laboratory Manual, Cold Spring Harbor, New York, USA, 1990. 

The preferred location of the marker gene in the targeting construct will depend on the aim 
of the gene targeting. For example, if the aim is to disrupt target gene expression, then the 
selectable marker can be cloned into targeting DNA corresponding to coding sequence in 
the target DNA. Alternatively, if the aim is to express an altered product from the target 
gene, such as a protein with an amino acid substitution, then the coding sequence can be 
modified to code for the substitution, and the selectable marker can be placed outside of 
the coding region, for example, in a nearby intron. 

The selectable marker may depend on its own promoter for expression and the marker 
gene may be derived from a very different organism than the organism being targeted (e.g. 
prokaryotic marker genes used in targeting mammalian cells). However, it is preferable to 
replace the original promoter with transcriptional machinery known to function in the 
recipient cells. A large number of transcriptional initiation regions are available for such 
purposes including, for example, metallothionein promoters, thymidine kinase promoters, 
β-actin promoters, immunoglobulin promoters, SV40 promoters and human 
cytomegalovirus promoters. A widely used example is the pSV2-neo plasmid which has 
the bacterial neomycin phosphotransferase gene under control of the SV40 early promoter 
and confers in mammalian cells resistance to G418 (an antibiotic related to neomycin). A 
number of other variations may be employed to enhance expression of the selectable 
markers in animal cells, such as the addition of a poly(A) sequence and the addition of 
synthetic translation initiation sequences. Both constitutive and inducible promoters may 
be used. 

The DNA is preferably modified by homologous recombination. The target DNA can be in 
any organelle of the animal cell including the nucleus and mitochondria and can be an 
intact gene, an exon or intron, a regulatory sequence or any region between genes. 

Homologous DNA is a DNA sequence that is at least 70% identical with a reference DNA 
sequence. An indication that two sequences are homologous is that they will hybridize 
with each other under stringent conditions (Sambrook et al, 1990, supra). 
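
As a rough illustration of the 70% identity criterion, the following Python sketch computes percent identity between two sequences that have already been aligned to equal length. The function names `percent_identity` and `is_homologous`, the use of '-' for gaps and the choice of the alignment length as denominator are assumptions made for this example; the specification does not prescribe an alignment method.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: the sequences are already aligned, '-' marks a gap,
    and a gap position never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def is_homologous(seq_a: str, seq_b: str, threshold: float = 70.0) -> bool:
    """Apply the 'at least 70% identical' criterion from the text."""
    return percent_identity(seq_a, seq_b) >= threshold
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch only covers the final counting step.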

The genetically modified animals contemplated herein include "knock out" or "knock in" 
animals or animals with genetic sequences carrying one or more nucleotide additions, deletions, 
substitutions and/or insertions. They are useful in a range of applications including the 
development of medical assessment systems such as to monitor particular physiological 
conditions including genetic defects such as but not limited to spina bifida in humans. The 
medical assessment systems are also useful as a model for wound healing and closure and 
for agents which modulate same. 

The present invention further contemplates conditional genetically modified animals, such 
as those produced using recombination methods that are standard in the art. Bacteriophage 
P1 Cre recombinase and flp recombinase from yeast plasmids are two non-limiting 
examples of site-specific DNA recombinase enzymes that cleave DNA at specific target 
sites (loxP sites for Cre recombinase and frt sites for flp recombinase). 

The present invention further contemplates co-suppression (i.e. sense suppression) and 
antisense suppression to down-regulate expression of M-grh. This would generally occur in 
a target test animal such as to generate a disease model. 

In addition to providing a diagnostic capability as described above, the isolated nucleic 
acid molecules of the present invention may also provide a therapeutic capability by being 
used to correct or complement an abnormality detected in a subject. To deliver the 
appropriate sequence to a recipient cell or tissue of a subject, an isolated nucleic acid 
molecule of the present invention may be cloned into a suitable genetic construct such as a 
suitable vector. 

Accordingly, a further aspect of the present invention contemplates a genetic construct 
comprising a nucleotide sequence encoding an M-grh selected from SEQ ID NO:1, SEQ 
ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13 or 
SEQ ID NO:15 or a variant thereof or a nucleotide sequence having at least 60% similarity 
to one or more of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID 
NO:9, SEQ ID NO:11, SEQ ID NO:13 or SEQ ID NO:15 or a variant thereof or a 
nucleotide sequence capable of hybridizing to SEQ ID NO:1, SEQ ID NO:3, SEQ ID 
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13 or SEQ ID NO:15 
under low stringency conditions or a variant thereof or a complementary form thereof. 

A "vector" is a polynucleotide molecule, preferably a DNA molecule derived, for example, 
from a plasmid, bacteriophage, or plant virus, into which a polynucleotide can be inserted 
or cloned. A vector preferably contains one or more unique restriction sites and can be 
capable of autonomous replication in a defined host cell including a target cell or tissue or 
a progenitor cell or tissue thereof, or be integrable with the genome of the defined host 
such that the cloned sequence is reproducible. Accordingly, the vector may be an 
autonomously replicating vector, i.e. a vector that exists as an extra-chromosomal entity, 
the replication of which is independent of chromosomal replication. Examples include a 
linear or closed circular plasmid, an extra-chromosomal element, a mini-chromosome, or 
an artificial chromosome. The vector may also contain a means for assuring 
self-replication. Alternatively, the vector may be one which, when introduced into the host cell, 
is integrated into the genome and replicated together with the chromosome(s) into which it 
has been integrated. A vector system may comprise a single vector or plasmid, two or more 
vectors or plasmids, which together contain the total DNA to be introduced into the 
genome of the host cell, or a transposon. 

Vectors suitable for gene therapy applications are well known in the art. The choice of the 
vector will typically depend on the compatibility of the vector with the host cell into which 
it is to be introduced. The vector may also include an additional genetic construct 
comprising a selection marker such as an antibiotic resistance gene that can be used for 
selection of suitable transformants. Examples of such resistance genes are known to those 
skilled in the art and include the nptII gene that confers resistance to the antibiotics 
kanamycin and G418 (Geneticin®) and the hph gene which confers resistance to the 
antibiotic hygromycin B. 

Accordingly, in a related embodiment, the present invention provides a genetic construct 
comprising a promoter or functional equivalent thereof operably linked to a nucleotide 
sequence of the invention. 

Reference herein to a "promoter" is to be taken in its broadest context and includes the 
transcriptional regulatory sequences of a classical genomic gene, which are required for 
accurate transcription initiation, with or without a CCAAT box sequence and additional 
regulatory elements (i.e. upstream activating sequences, enhancers and silencers), which 
alter gene expression in response to developmental and/or external stimuli, or in a tissue- 
specific manner. A promoter is usually, but not necessarily, positioned upstream (5') of a 
gene region, the expression of which it regulates. Furthermore, the regulatory elements 
comprising a promoter are usually positioned within 2 kb of the start site of transcription of 
the gene. As is known in the art, some variation in this distance can be accommodated 
without loss of promoter function. 

The selection of an appropriate promoter sequence to regulate expression of a transcription 
factor encoded by an isolated nucleic acid molecule of the present invention is an 
important consideration. Examples of suitable promoters include viral, fungal, bacterial, 
animal and plant derived promoters capable of functioning in eukaryotic animal cells and, 
especially, human cells. The promoter may regulate the expression of the nucleic acid 
molecule differentially with respect to the cell, tissue or organ in which expression occurs, 
or with respect to the developmental stage at which expression occurs. 

Preferably, the promoter is capable of regulating expression of a nucleic acid molecule in a 
eukaryotic cell, tissue or organ, at least during the period of time over which the regulated 
gene is expressed therein, and more preferably also immediately preceding the 
commencement of detectable expression of the regulated gene in said cell, tissue or organ. 

Particularly preferred promoters for use with the nucleic acid molecules of the present 
invention include the bacteriophage T7 promoter, bacteriophage T3 promoter, SP6 
promoter, lac operator-promoter, tac promoter, SV40 late promoter, SV40 early promoter, 
RSV-LTR promoter, CMV IE promoter, CaMV 35S promoter, SCSV promoter, SCBV 
promoter and the like. Those skilled in the art will readily be aware of additional promoter 
sequences other than those specifically described. 

In the present context, the terms "in operable connection with" or "operably linked" or 
similar shall be taken to indicate that expression of the nucleic acid molecule is under the 
control of the promoter sequence, with which it is spatially connected, in a cell, tissue, 
organ or whole organism. 

The genetic construct of the present invention may also comprise a 3' non-translated 
sequence. A 3' non-translated sequence refers to that portion of a gene comprising a DNA 
segment that contains a polyadenylation signal and any other regulatory signals capable of 
effecting mRNA processing or gene expression. The polyadenylation signal is 
characterized by effecting the addition of polyadenylic acid tracts to the 3' end of the 
mRNA precursor. Polyadenylation signals are commonly recognized by the presence of 
homology to the canonical form 5'-AATAAA-3' although variations are not uncommon. 
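
By way of illustration only, the following Python sketch scans a DNA sequence for the canonical 5'-AATAAA-3' signal. The function name `find_polya_signals` is hypothetical, and the inclusion of ATTAAA as an example variant is an assumption of this sketch rather than something stated in the specification.

```python
import re

# Canonical signal from the text, plus ATTAAA as one example variant
# (the variant list is an illustrative assumption).
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(dna, signals=POLYA_SIGNALS):
    """Return sorted (0-based position, motif) pairs for every signal found."""
    dna = dna.upper()
    hits = []
    for motif in signals:
        # A lookahead is used so that overlapping occurrences are reported.
        for match in re.finditer("(?=" + re.escape(motif) + ")", dna):
            hits.append((match.start(), motif))
    return sorted(hits)
```

An exact-match scan of this kind only flags candidate signals; it does not indicate whether a given site is functional.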

Accordingly, a genetic construct comprising a nucleic acid molecule of the present 
invention, operably linked to a promoter, may be cloned into a suitable vector for delivery 
to a cell or tissue in which regulation is faulty, malfunctioning or non-existent, in order to 
rectify and/or provide the appropriate regulation. Vectors comprising appropriate genetic 
constructs may be delivered into target eukaryotic cells by a number of different means 
well known to those skilled in the art of molecular biology. 

The present invention further contemplates the use of an M-GRH or M-grh in the 
manufacture of a medicament for the treatment of a disease condition in a mammal such as 
a human. 

The present invention is further directed to promoters and 3'- and 5'-regulatory regions 
associated with genomic forms of M-grh genes. These regions can be readily identified by, 
for example, chromosome walking using M-grh nucleic acid molecules or probes or 
primers therefrom. 

A further aspect of the present invention relates to the use of the invention in relation to the 
treatment and/or prophylaxis of disease conditions. Without limiting the present invention 
to any one theory or mode of action, the broad range of cellular functional activities which 
are regulated by transcription factors renders the regulation of transcription factor function 
an integral component of every aspect of both healthy and disease state physiological 
processes. Accordingly, the method of the present invention provides a valuable tool for 
modulating aberrant or otherwise unwanted cellular functional activity which is regulated 
via transcription factors. 

Accordingly, another aspect of the present invention is directed to a method for the 
treatment and/or prophylaxis of a condition in a subject, which condition is characterised 
by aberrant, unwanted or otherwise inappropriate cellular activity, said method comprising 
administering to said subject an effective amount of an agent for a time and under 
conditions sufficient to modulate transcription factor function. 

The terms "agent", "compound", "active agent", "pharmacologically active agent", 
"medicament", "active" and "drug" are used interchangeably herein to refer to a chemical 
compound that induces a desired pharmacological and/or physiological effect. The terms 
also encompass pharmaceutically acceptable and pharmacologically active ingredients of 
those active agents specifically mentioned herein including but not limited to salts, esters, 
amides, prodrugs, active metabolites, analogs and the like. When the terms "agent", 
"compound", "active agent", "pharmacologically active agent", "medicament", "active" 
and "drug" are used, then it is to be understood that this includes the active agent per se as 
well as pharmaceutically acceptable, pharmacologically active salts, esters, amides, 
prodrugs, metabolites, analogs, etc. The term "compound" is not to be construed as a 
chemical compound only but extends to peptides, polypeptides and proteins as well as 
genetic molecules such as RNA, DNA and chemical analogs thereof as well as RNAi- or 
siRNA-type molecules or complexes comprising same. In accordance with the previous 
aspects of the present invention, the agent preferably comprises a transcription factor or 
genetic molecules encoding same or derivative, analogue, chemical equivalent or mimetic 
thereof. 

"Subject" as used herein refers to an animal, preferably a mammal and more preferably 
a human who can benefit from the pharmaceutical formulations and methods of the present 
invention. There is no limitation on the type of animal that could benefit from the presently 
described pharmaceutical formulations and methods. A patient regardless of whether a 
human or non-human animal may be referred to as an individual, subject, animal, host or 
recipient. 

The preferred animals are humans or other primates, livestock animals, laboratory test 
animals, companion animals or captive wild animals. 

Examples of laboratory test animals include mice, rats, rabbits, guinea pigs and hamsters. 
Rabbits and rodent animals, such as rats and mice, provide a convenient test system or 
animal model. Livestock animals include sheep, cows, pigs, goats, horses and donkeys. 
Non-mammalian animals such as zebrafish and amphibians (including cane toads) are also 
contemplated. 

An "effective amount" means an amount necessary at least partly to attain the desired 
response, or to delay the onset or inhibit progression or halt altogether, the onset or 
progression of a particular condition being treated. The amount varies depending upon the 
health and physical condition of the individual to be treated, the taxonomic group of 
individual to be treated, the degree of protection desired, the formulation of the 
composition, the assessment of the medical situation, and other relevant factors. It is 
expected that the amount will fall in a relatively broad range that can be determined 
through routine trials. 

Reference herein to "treatment" and "prophylaxis" is to be considered in its broadest 
context. The term "treatment" does not necessarily imply that a subject is treated until total 
recovery. Similarly, "prophylaxis" does not necessarily mean that the subject will not 
eventually contract a disease condition. Accordingly, treatment and prophylaxis include 
amelioration of the symptoms of a particular condition or preventing or otherwise reducing 
the risk of developing a particular condition. The term "prophylaxis" may be considered as 
reducing the severity or onset of a particular condition. "Treatment" may also reduce the 
severity of an existing condition. 

The present invention further contemplates a combination of therapies, such as the 
administration of the agent together with subjection of the mammal to other agents, drugs 
or treatments which may be useful in relation to the treatment of the subject condition such 
as spina bifida and anencephaly. 

Administration of the modulatory agent, in the form of a pharmaceutical composition, may 
be performed by any convenient means. The modulatory agent of the pharmaceutical 
composition is contemplated to exhibit therapeutic activity when administered in an 
amount which depends on the particular case. The variation depends, for example, on the 
human or animal and the modulatory agent chosen. A broad range of doses may be 
applicable. Considering a patient, for example, from about 0.1 mg, 0.2 mg, 0.3 mg, 0.4 mg, 
0.5 mg, 0.6 mg, 0.7 mg, 0.8 mg, 0.9 mg to about 1 mg of modulatory agent may be 
administered per kilogram of body weight per day. Dosage regimes may be adjusted to 
provide the optimum therapeutic response. For example, several divided doses may be 
administered daily, weekly, monthly or other suitable time intervals or the dose may be 
proportionally reduced as indicated by the exigencies of the situation. 
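
The per-kilogram range above reduces to simple arithmetic, sketched below in Python purely as an illustration. The function names `daily_dose_range_mg` and `divided_dose_mg`, and the equal split into divided doses, are assumptions of this example; it is not a dosing recommendation.

```python
from typing import Tuple


def daily_dose_range_mg(body_weight_kg: float,
                        low_mg_per_kg: float = 0.1,
                        high_mg_per_kg: float = 1.0) -> Tuple[float, float]:
    """Total daily dose range (mg) for the stated 0.1 to 1 mg/kg/day range."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


def divided_dose_mg(daily_dose_mg: float, doses_per_day: int) -> float:
    """Size of each dose when the daily amount is split into equal parts."""
    if doses_per_day < 1:
        raise ValueError("doses_per_day must be at least 1")
    return daily_dose_mg / doses_per_day
```

For instance, under these assumptions a 70 kg individual would fall in a range of roughly 7 mg to 70 mg per day, and a 70 mg daily amount given as four equal doses would be 17.5 mg per dose.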

The modulatory agent may be administered in a convenient manner such as by the oral, 
intravenous (where water soluble), intraperitoneal, intramuscular, subcutaneous, 
intradermal or suppository routes or implanting (e.g. using slow release molecules). The 
modulatory agent may be administered in the form of pharmaceutically acceptable 
nontoxic salts, such as acid addition salts or metal complexes, e.g. with zinc, iron or the 
like (which are considered as salts for purposes of this application). Illustrative of such 
acid addition salts are hydrochloride, hydrobromide, sulphate, phosphate, maleate, acetate, 
citrate, benzoate, succinate, malate, ascorbate, tartrate and the like. If the active ingredient 
is to be administered in tablet form, the tablet may contain a binder such as tragacanth, 
corn starch or gelatin; a disintegrating agent, such as alginic acid; and a lubricant, such as 
magnesium stearate. 

Routes of administration include, but are not limited to, respiratorally, intratracheally, 
nasopharyngeally, intravenously, intraperitoneally, subcutaneously, intracranially, 
intradermally, intramuscularly, intraocularly, intrathecally, intracerebrally, intranasally, 
infusion, orally, rectally, via IV drip, patch and implant. 

In accordance with these methods, the agent defined in accordance with the present 
invention may be coadministered with one or more other compounds or molecules. By 
"coadministered" is meant simultaneous administration in the same formulation or in two 
different formulations via the same or different routes or sequential administration by the 
same or different routes. For example, the subject agent may be administered together 
with an agonistic agent in order to enhance its effects. By "sequential" administration is 
meant a time difference of from seconds, minutes, hours or days between the 
administration of the two types of molecules. These molecules may be administered in any 
order. 

Another aspect of the present invention contemplates the use of an agent, as hereinbefore 
defined, in the manufacture of a medicament for the treatment of a condition in a subject, 
which condition is characterised by aberrant, unwanted or otherwise inappropriate cellular 
activity, wherein said agent modulates transcription factor function. 

In yet another further aspect, the present invention contemplates a pharmaceutical 
composition comprising the modulatory agent as hereinbefore defined together with one or 
more pharmaceutically acceptable carriers and/or diluents. These agents are referred to as 
the active ingredients. 

The pharmaceutical forms suitable for injectable use include sterile aqueous solutions 
(where water soluble) or dispersions and sterile powders for the extemporaneous 
preparation of sterile injectable solutions or dispersion or may be in the form of a cream or 
other form suitable for topical application. It must be stable under the conditions of 
manufacture and storage and must be preserved against the contaminating action of 
microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion 
medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene 
glycol and liquid polyethylene glycol, and the like), suitable mixtures thereof, and 
vegetable oils. The proper fluidity can be maintained, for example, by the use of a coating 
such as lecithin, by the maintenance of the required particle size in the case of dispersion 
and by the use of surfactants. The prevention of the action of microorganisms can be 
brought about by various antibacterial and antifungal agents, for example, parabens, 
chlorobutanol, phenol, sorbic acid, thimerosal and the like. In many cases, it will be 
preferable to include isotonic agents, for example, sugars or sodium chloride. Prolonged 
absorption of the injectable compositions can be brought about by the use in the 
compositions of agents delaying absorption, for example, aluminum monostearate and 
gelatin. 



Sterile injectable solutions are prepared by incorporating the active compounds in the 
required amount in the appropriate solvent with various of the other ingredients 
enumerated above, as required, followed by filtered sterilisation. Generally, dispersions 
are prepared by incorporating the various sterilised active ingredients into a sterile vehicle 
which contains the basic dispersion medium and the required other ingredients from those 
enumerated above. In the case of sterile powders for the preparation of sterile injectable 
solutions, the preferred methods of preparation are vacuum drying and the freeze-drying 
technique which yield a powder of the active ingredient plus any additional desired 
ingredient from previously sterile-filtered solution thereof. 



When the active ingredients are suitably protected they may be orally administered, for 
example, with an inert diluent or with an assimilable edible carrier, or it may be enclosed 
in hard or soft shell gelatin capsule, or it may be compressed into tablets, or it may be 
incorporated directly with the food of the diet. For oral therapeutic administration, the 
active compound may be incorporated with excipients and used in the form of ingestible 
tablets, buccal tablets, troches, capsules, elixirs, suspensions, syrups, wafers, and the like. 
Such compositions and preparations should contain at least 1% by weight of active 
compound. The percentage of the compositions and preparations may, of course, be varied 
and may conveniently be between about 5 to about 80% of the weight of the unit. The 
amount of active compound in such therapeutically useful compositions is such that a 
suitable dosage will be obtained. Preferred compositions or preparations according to the 
present invention are prepared so that an oral dosage unit form contains between about 0.1 
µg and 2000 mg of active compound. 

The tablets, troches, pills, capsules and the like may also contain the components as listed 
hereafter: a binder such as gum, acacia, corn starch or gelatin; excipients such as dicalcium 
phosphate; a disintegrating agent such as corn starch, potato starch, alginic acid and the 
like; a lubricant such as magnesium stearate; and a sweetening agent such as sucrose, 
lactose or saccharin may be added or a flavouring agent such as peppermint, oil of 
wintergreen, or cherry flavouring. When the dosage unit form is a capsule, it may contain, 
in addition to materials of the above type, a liquid carrier. Various other materials may be 
present as coatings or to otherwise modify the physical form of the dosage unit. For 
instance, tablets, pills, or capsules may be coated with shellac, sugar or both. A syrup or 
elixir may contain the active compound, sucrose as a sweetening agent, methyl and 
propylparabens as preservatives, a dye and flavouring such as cherry or orange flavour. Of 
course, any material used in preparing any dosage unit form should be pharmaceutically 
pure and substantially non-toxic in the amounts employed. In addition, the active 
compound(s) may be incorporated into sustained-release preparations and formulations. 

Antisense polynucleotide sequences are another useful example of a therapeutic agent 
which can prevent or diminish the expression of the transcription factor genetic sequences, 
as will be appreciated by those skilled in the art. Polynucleotide vectors, for example, 
containing all or a portion of the M-grh sequences or other sequences from an M-grh 
region (particularly those flanking an M-grh gene locus) may be placed under the control 
of a promoter in an antisense orientation and introduced into a cell. Expression of such an 
antisense construct within a cell will interfere with gene transcription and/or translation. 
Furthermore, co-suppression and mechanisms to induce RNAi (i.e. siRNA) may also be 
employed. Such techniques may be useful to inhibit genes which positively promote 
M-grh gene expression. Alternatively, antisense or sense molecules may be directly 
administered. In this latter embodiment, the antisense or sense molecules may be 
formulated in a composition and then administered by any number of means to target cells. 

A variation on antisense and sense molecules involves the use of morpholinos, which are 
oligonucleotides composed of morpholine nucleotide derivatives and phosphorodiamidate 
linkages (for example, Summerton and Weller, Antisense and Nucleic Acid Drug 
Development 7: 187-195, 1997). Such compounds are injected into embryos and the effect 
of interference with mRNA is observed. 

In one embodiment, the present invention employs compounds such as oligonucleotides 
and similar species for use in modulating the function or effect of nucleic acid molecules 
encoding an M-GRH transcription factor, i.e. the oligonucleotides induce transcriptional or 
post-transcriptional gene silencing. This is accomplished by providing oligonucleotides 
which specifically hybridize with one or more nucleic acid molecules encoding the 
transcription factor. As used herein, the terms "target nucleic acid" and "nucleic acid 
molecule encoding a transcription factor" have been used for convenience to encompass 
DNA encoding M-GRH, RNA (including pre-mRNA and mRNA or portions thereof) 
transcribed from such DNA, and also cDNA derived from such RNA. The hybridization of 
a compound of the subject invention with its target nucleic acid is generally referred to as 
"antisense". Consequently, the preferred mechanism believed to be included in the practice 
of some preferred embodiments of the invention is referred to herein as "antisense 
inhibition." Such antisense inhibition is typically based upon hydrogen bonding-based 
hybridization of oligonucleotide strands or segments such that at least one strand or 
segment is cleaved, degraded, or otherwise rendered inoperable. In this regard, it is 
presently preferred to target specific nucleic acid molecules and their functions for such 
antisense inhibition. 

The functions of DNA to be interfered with can include replication and transcription. 
Replication and transcription, for example, can be from an endogenous cellular template, a 
vector, a plasmid construct or otherwise. The functions of RNA to be interfered with can 
include functions such as translocation of the RNA to a site of protein translation, 
translocation of the RNA to sites within the cell which are distant from the site of RNA 
synthesis, translation of protein from the RNA, splicing of the RNA to yield one or more 
RNA species, and catalytic activity or complex formation involving the RNA which may 
be engaged in or facilitated by the RNA. One preferred result of such interference with 
target nucleic acid function is modulation of the expression of an M-grh gene. In the context 
of the present invention, "modulation" and "modulation of expression" mean either an 
increase (stimulation) or a decrease (inhibition) in the amount or levels of a nucleic acid 
molecule encoding the gene, e.g., DNA or RNA. Inhibition is often the preferred form of 
modulation of expression and mRNA is often a preferred target nucleic acid. 

In the context of this invention, "hybridization" means the pairing of complementary 
strands of oligomeric compounds. In the present invention, the preferred mechanism of 
pairing involves hydrogen bonding, which may be Watson-Crick, Hoogsteen or reversed 
Hoogsteen hydrogen bonding, between complementary nucleoside or nucleotide bases 
(nucleobases) of the strands of oligomeric compounds. For example, adenine and thymine 
are complementary nucleobases which pair through the formation of hydrogen bonds. 
Hybridization can occur under varying circumstances. 

An antisense compound is specifically hybridizable when binding of the compound to the 
target nucleic acid interferes with the normal function of the target nucleic acid to cause a 
loss of activity, and there is a sufficient degree of complementarity to avoid non-specific 
binding of the antisense compound to non-target nucleic acid sequences under conditions 
in which specific binding is desired, i.e. under physiological conditions in the case of in 
vivo assays or therapeutic treatment, and under conditions in which assays are performed 
in the case of in vitro assays. 

"Complementary" as used herein, refers to the capacity for precise pairing between two 
nucleobases of an oligomeric compound. For example, if a nucleobase at a certain position 
of an oligonucleotide (an oligomeric compound), is capable of hydrogen bonding with a 
nucleobase at a certain position of a target nucleic acid, said target nucleic acid being a 
DNA, RNA, or oligonucleotide molecule, then the position of hydrogen bonding between 
the oligonucleotide and the target nucleic acid is considered to be a complementary 
position. The oligonucleotide and the further DNA, RNA, or oligonucleotide molecule are 
complementary to each other when a sufficient number of complementary positions in 
each molecule are occupied by nucleobases which can hydrogen bond with each other. 
Thus, "specifically hybridizable" and "complementary" are terms which are used to 
indicate a sufficient degree of precise pairing or complementarity over a sufficient number 
of nucleobases such that stable and specific binding occurs between the oligonucleotide 
and a target nucleic acid. 
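
The notion of complementary positions can be sketched in Python as follows. This is a minimal illustration assuming a DNA alphabet, Watson-Crick pairing only and antiparallel strands of equal length; the names `reverse_complement` and `complementary_fraction` are hypothetical, and the specification does not fix a numerical threshold for a "sufficient" number of positions.

```python
# Watson-Crick pairs for DNA (assumption: no RNA bases, no modified bases).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Antisense strand, written 5'->3', for a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def complementary_fraction(oligo: str, target_site: str) -> float:
    """Fraction of positions at which the oligo can pair with the target site.

    Both sequences are written 5'->3' and must have equal length; the
    target is read in reverse because the strands pair antiparallel.
    """
    if len(oligo) != len(target_site) or not oligo:
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1
        for o, t in zip(oligo.upper(), reversed(target_site.upper()))
        if COMPLEMENT.get(o) == t
    )
    return paired / len(oligo)
```

An oligonucleotide designed with `reverse_complement` pairs at every position of its target site, whereas a single mismatch in a six-base oligonucleotide lowers the fraction to five sixths.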

According to the present invention, compounds include antisense oligomeric compounds, 
antisense oligonucleotides, ribozymes, external guide sequence (EGS) oligonucleotides, 
alternate splicers, primers, probes, and other oligomeric compounds which hybridize to at 
least a portion of the target nucleic acid. As such, these compounds may be introduced in 
the form of single-stranded, double-stranded, circular or hairpin oligomeric compounds 
and may contain structural elements such as internal or terminal bulges or loops. Once 
introduced to a system, the compounds of the invention may elicit the action of one or 
more enzymes or structural proteins to effect modification of the target nucleic acid. One 
non-limiting example of such an enzyme is RNase H, a cellular endonuclease which 
cleaves the RNA strand of an RNA:DNA duplex. It is known in the art that single-stranded 
antisense compounds which are "DNA-like" elicit RNase H. Activation of RNase H, 
therefore, results in cleavage of the RNA target, thereby greatly enhancing the efficiency 
of oligonucleotide-mediated inhibition of gene expression. Similar roles have been 
postulated for other ribonucleases such as those in the RNase III and ribonuclease L family 
of enzymes. 

While the preferred form of antisense compound is a single-stranded antisense
oligonucleotide, in many species the introduction of double-stranded structures, such as
double-stranded RNA (dsRNA) molecules, has been shown to induce potent and specific
antisense-mediated reduction of the function of a gene or its associated gene products. This
phenomenon occurs in both plants and animals.

In the context of the subject invention, the term "oligomeric compound" refers to a
polymer or oligomer comprising a plurality of monomeric units. In the context of this
invention, the term "oligonucleotide" refers to an oligomer or polymer of ribonucleic acid
(RNA) or deoxyribonucleic acid (DNA) or mimetics, chimeras, analogs and homologs
thereof. This term includes oligonucleotides composed of naturally occurring nucleobases,
sugars and covalent internucleoside (backbone) linkages as well as oligonucleotides having
non-naturally occurring portions which function similarly. Such modified or substituted
oligonucleotides are often preferred over native forms because of desirable properties such
as, for example, enhanced cellular uptake, enhanced affinity for a target nucleic acid and
increased stability in the presence of nucleases.

While oligonucleotides are a preferred form of the compounds of this invention, the
present invention comprehends other families of compounds as well, including but not
limited to oligonucleotide analogs and mimetics such as those described herein.

The compounds in accordance with this invention preferably comprise from about 8 to
about 80 nucleobases (i.e. from about 8 to about 80 linked nucleosides). One of ordinary
skill in the art will appreciate that the invention embodies compounds of 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, or 80 nucleobases
in length.

As is known in the art, a nucleoside is a base-sugar combination. The base portion of the
nucleoside is normally a heterocyclic base. The two most common classes of such
heterocyclic bases are the purines and the pyrimidines. Nucleotides are nucleosides that
further include a phosphate group covalently linked to the sugar portion of the nucleoside.
For those nucleosides that include a pentofuranosyl sugar, the phosphate group can be
linked to either the 2', 3' or 5' hydroxyl moiety of the sugar. In forming oligonucleotides,
the phosphate groups covalently link adjacent nucleosides to one another to form a linear
polymeric compound. In turn, the respective ends of this linear polymeric compound can
be further joined to form a circular compound; however, linear compounds are generally
preferred. In addition, linear compounds may have internal nucleobase complementarity
and may therefore fold in such a manner as to produce a fully or partially double-stranded
compound. Within oligonucleotides, the phosphate groups are commonly referred to as
forming the internucleoside backbone of the oligonucleotide. The normal linkage or
backbone of RNA and DNA is a 3' to 5' phosphodiester linkage.

Specific examples of preferred antisense compounds useful in this invention include
oligonucleotides containing modified backbones or non-natural internucleoside linkages.
As defined in this specification, oligonucleotides having modified backbones include those
that retain a phosphorus atom in the backbone and those that do not have a phosphorus
atom in the backbone. For the purposes of this specification, and as sometimes referenced
in the art, modified oligonucleotides that do not have a phosphorus atom in their
internucleoside backbone can also be considered to be oligonucleosides.

Preferred modified oligonucleotide backbones containing a phosphorus atom therein
include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates,
phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates
including 3'-alkylene phosphonates, 5'-alkylene phosphonates and chiral phosphonates,
phosphinates, phosphoramidates including 3'-amino phosphoramidate and
aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates,
thionoalkylphosphotriesters, selenophosphates and boranophosphates having normal 3'-5'
linkages, 2'-5' linked analogs of these, and those having inverted polarity wherein one or
more internucleotide linkages is a 3' to 3', 5' to 5' or 2' to 2' linkage. Preferred
oligonucleotides having inverted polarity comprise a single 3' to 3' linkage at the 3'-most
internucleotide linkage i.e. a single inverted nucleoside residue which may be abasic (the
nucleobase is missing or has a hydroxyl group in place thereof). Various salts, mixed salts
and free acid forms are also included.

Many of the preferred features described above are appropriate for sense nucleic acid 
molecules. 

Another aspect of the present invention contemplates a method for the treatment or
prophylaxis of an animal, said method comprising exposing stem cells in said animal to
one or more agents comprising a transcription factor, or a genetic molecule encoding a
transcription factor or derivative, analogue, chemical equivalent or mimetic thereof which
facilitates the proliferation and/or differentiation and/or self-renewal of stem cells to
facilitate repair, replacement or augmentation of particular tissue.

In a related embodiment, the present invention provides a method for the treatment or
prophylaxis of an animal, said method comprising exposing stem cells and mature cells or
cells developmentally in between in said animal to one or more agents which facilitate the
proliferation and/or differentiation and/or self-renewal of stem cells and mature cells or of
cells developmentally in between to facilitate repair, replacement or augmentation of
particular tissue.

As indicated above, the term "animal" includes a human amongst a range of other animals
including avian species. The agent may comprise a single molecule or a combination of
two or more molecules in a synergistic combination, admixture or cocktail. When in
combination, the agents may be administered or used simultaneously or used sequentially
such as seconds, minutes, hours, days or weeks apart. As indicated further below, the agent
including therapeutic agent of this aspect of the present invention may also be a multi-part
pharmaceutical pack or composition with instructions for use.

Reference to "exposing" to stem cells includes the situation where an agent is introduced
into the body of the animal or the agent is contacted to an internal or external surface of,
for example, skin or an organ or is otherwise administered to a surface or sub-surface or
internal region. Alternatively, stem cells are removed from the animal's body and exposed
to the agent ex vivo to facilitate differentiation and/or proliferation and/or self-renewal
(either ex vivo or in vivo) and then the cells are returned to the same or different individual.
According to this aspect of the present invention, although it is preferred to administer a
therapeutic agent to an animal subject (e.g. a human), part of the therapeutic protocol may
occur ex vivo. For example, proliferation may occur in vivo and differentiation may occur
in vivo. Alternatively, part proliferation and part differentiation may occur ex vivo and
be further facilitated in vivo by the administration of a therapeutic agent. In yet a further
alternative, proliferation occurs ex vivo and partial differentiation occurs in vitro but
complete differentiation occurs in vivo. The terms "ex vivo" and "in vitro" are used
interchangeably in this specification. In some circumstances, stem cells maintained in vitro
may be used. Alternatively, ex vivo cells may first be genetically modified prior to
re-introduction into a subject's body or a compatible counterpart.

Reference to "in vitro" or "ex vivo" means in tissue culture or in any situation outside the
animal body. The term "in vivo" also means in situ and means treatment inside an animal
body.

A mature cell in this context also includes a committed cell. This aspect of the present 
invention also extends to fetal cells such as ES cells or EG cells. 

The entire repertoire of stem cells may be targeted by the therapeutic agent or one or more
sub-populations may be induced to proliferate and/or differentiate. This is the difference
between a generic agent and a specific agent. Cell sub-populations contemplated by the
present invention include cells from the brain (e.g. adult neural stem cells, neurons,
astrocytes), epidermis (e.g. keratinocyte stem cells, keratinocyte transient amplifying cells,
keratinocyte post-mitotic differentiating cells, melanocyte stem cells, melanocytes),
embryos (e.g. ES or EG cells), skin (e.g. foreskin fibroblasts), pancreas (e.g. pancreatic
islet cells, pancreatic β-cells), kidney (e.g. adult renal stem cells, embryonic renal epithelial
stem cells, kidney epithelial cells), liver (e.g. hepatocytes, bile duct epithelial cells,
embryonic endodermal stem cells, adult hepatocyte stem cells), breast (e.g. mammary
epithelial stem cells), lung (e.g. bone marrow-derived stem cells, lung fibroblasts,
bronchial epithelial cells, alveolar type II pneumocytes), muscle (e.g. skeletal muscle stem
cells [satellite cells]), heart (e.g. cardiomyocytes, bone marrow mesenchymal stem cells),
eye (e.g. limbal stem cells, corneal epithelial cells), bone (e.g. mesenchymal stem cells,
osteoblasts [precursor of mesenchymal stem cells], peripheral blood mononuclear
progenitor cells [hematopoietic stem cells], osteoclasts), spleen (e.g. splenocytes) and cells
from the immune system (e.g. CD34+ stem cells, CD11c+ cells, CD11c- cells, CD4+ T-cells,
CD8+ T-cells, NK cells, monocytes, macrophages, dendritic cells and β-cells).

Whilst some of the above-listed cells are "mature" cells, they nevertheless may participate
in a repair, regeneration or augmentation process by being selectively proliferated or used
to "home" in on particular tissue in need of treatment. Accordingly, the present invention is
not to be interpreted as excluding the participation of mature cell types in the repair,
regeneration and/or augmentation process as well as any other cell at a developmental
stage between an ES cell and a mature cell.

Accordingly, another aspect of the present invention contemplates a method for tissue
repair, regeneration and/or augmentation in an animal, said method comprising
administering to said animal an agent or a combination of two or more agents, which
agents promote or otherwise facilitate the proliferation and/or differentiation and/or
self-renewal of a cell type selected from the listing comprising adult neural stem cells,
neurons, astrocytes, keratinocyte stem cells, keratinocyte transient amplifying cells,
keratinocyte post-mitotic differentiating cells, melanocyte stem cells, melanocytes,
embryonic stem cells, embryonic germ cells, foreskin fibroblasts, pancreatic islet cells,
pancreatic β-cells, adult renal stem cells, embryonic renal epithelial stem cells, kidney
epithelial cells, hepatocytes, bile duct epithelial cells, embryonic endodermal stem cells,
adult hepatocyte stem cells, mammary epithelial stem cells, bone marrow-derived stem
cells, lung fibroblasts, bronchial epithelial cells, alveolar type II pneumocytes, skeletal
muscle stem cells [satellite cells], cardiomyocytes, bone marrow mesenchymal stem cells,
limbal stem cells, corneal epithelial cells, mesenchymal stem cells, osteoblasts [precursor of
mesenchymal stem cells], peripheral blood mononuclear progenitor cells [hematopoietic
stem cells], osteoclasts or splenocytes, said agents being administered for a time and under
conditions sufficient to promote tissue repair, augmentation and/or regeneration.

Those skilled in the art will appreciate that the invention described herein is susceptible to
variations and modifications other than those specifically described. It is to be understood
that the invention includes all such variations and modifications. The invention also
includes all of the steps, features, compositions and compounds referred to or indicated in
this specification, individually or collectively, and any and all combinations of any two or
more of said steps or features. It is also to be understood that unless stated otherwise, the
subject invention is not limited to specific formulation components, manufacturing
methods, dosage regimes, or the like, as such may vary.

The present invention is further described by the following non-limiting Examples. 

EXAMPLE 1 
Polymerase chain reaction 

For RT-PCR, first strand cDNA was prepared from 2 µg of mRNA from primary tissues
using random hexamers. Each cDNA sample was appropriately diluted to give similar
amplification of S14 RNA under the same PCR conditions. The primer sequences are
detailed below. The PCR conditions were 94°C for 2 min followed by 35 cycles of 94°C
for 30 sec, 60°C for 30 sec and 72°C for 45 sec with a final extension at 72°C for 5 min.
All PCR products were electrophoresed on 1.5% w/v agarose gels, transferred to
nitrocellulose and analyzed by Southern blot using 32P-radiolabeled internal
oligonucleotides as probes. Membranes were then autoradiographed for 2 hr at -70°C.
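
As an aid to the reader, the cycling conditions stated above may be written out as a
simple program and the total hold time computed. The class and constant names below are
hypothetical, and the time taken to ramp between temperatures, which is not given in
this Example, is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # hold temperature in degrees Celsius
    seconds: int  # hold time in seconds


# Cycling conditions as stated in Example 1.
INITIAL_DENATURATION = Step(94, 120)
CYCLE = (Step(94, 30), Step(60, 30), Step(72, 45))
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 300)


def programmed_seconds() -> int:
    """Total hold time of the program, ignoring ramping between temperatures."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (
        INITIAL_DENATURATION.seconds
        + N_CYCLES * per_cycle
        + FINAL_EXTENSION.seconds
    )
```

Each cycle holds for 105 seconds, so the 35 cycles account for 3675 of the 4095 seconds
(about 68 minutes) of programmed hold time.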

The following primers were used to amplify probes for cDNA library screening and for 
RT-PCR:- 

human p49 mgr
5'-GAAGTCTTTGATGCCCTGATG-3' [SEQ ID NO:19]
5'-AACCCATTCCCTCGACATAGA-3' [SEQ ID NO:20]

human p70 mgr
5'-AGCGCGATGACACAGGAGTA-3' [SEQ ID NO:21]
5'-CGTTGCTATGGAGACAGTGA-3' [SEQ ID NO:22]

human bom
5'-CCGTTTAACAAGGACACTGC-3' [SEQ ID NO:23]
5'-CTGGAAGCCACCAAATCTCT-3' [SEQ ID NO:24]

murine p70 mgr
5'-AGCGCGATGACACAGGAGTA-3' [SEQ ID NO:25]
5'-AGTGCCAGAGCTGAACTGAT-3' [SEQ ID NO:26]

murine p61 mgr
5'-TCCATGGGTTCCTTGAGTTC-3' [SEQ ID NO:27]
5'-AGTGCCAGAGCTGAACTGAT-3' [SEQ ID NO:28]

murine bom
5'-AAAGGGGAGCGAGTTCATTG-3' [SEQ ID NO:29]
5'-AGAGCTCTCGGTGATGGATA-3' [SEQ ID NO:30]

EXAMPLE 2
Cloning of human and murine mgr and bom

Human p49 mgr was cloned from a fetal brain cDNA phage library in the λZAP II vector
(Stratagene). The cDNA encoding the longer human MGR isoform was amplified by RT-PCR
from human kidney mRNA. The cDNA encoding the smaller murine isoform of
MGR was cloned from a 17.5-day embryo phage library in the Lambda TriplEx vector
(Clontech). The murine p70 cDNA was amplified from murine kidney mRNA by RT-PCR.
The human bom cDNA was isolated from a placental phage library in the Lambda ZAP II
vector (Stratagene) and the murine cDNA from an embryonic carcinoma cell line (P19)
phage library in the Uni-ZAP XR vector (Stratagene). The murine MGR genomic locus
was obtained from a 129SVJ phage library in the Lambda FIX II vector (Stratagene).

From similarity searches of GenBank databases, using the GRH protein sequence as a
query, two murine expressed sequence tag (EST) entries were found from adult brain and
ovary and one human EST entry from fetal brain that were not identical to any previously
reported genes, yet shared high degrees of homology with each other and grh. These
sequences were used to design murine and human primers, and probes were amplified from
murine adult brain and ovary and human adult brain cDNA. The murine probe from adult
brain cDNA was used in a screen of a day 17.5 mouse embryo cDNA library to obtain a
full length clone of a gene referred to as mammalian grainyhead (mgr) due to its sequence
and functional homology and similar expression pattern to that of the fly gene. The human
probe derived from adult brain cDNA was used to obtain a full length cDNA clone from a
human fetal brain library. Amino acid sequence comparison reveals this to be the human
homolog of MGR with 94% identity at the amino acid level.

The murine probe derived from ovary cDNA was used in a screen of a murine
teratocarcinoma cell line (P19) cDNA library to obtain a full length clone of a novel gene
distinct from but highly related to mgr named brother-of-mgr (bom). The homology
between mgr and bom suggests that mgr and bom arose through gene duplication.

The human homolog of bom was obtained using primers derived from a high throughput
genome sequencing (HTGS) database entry with homology to murine bom. These were
used to amplify a probe from a human placental cDNA library that was then screened to
yield a full length human cDNA clone. Amino acid sequence comparison between murine
and human BOM revealed 94% identity.
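
For illustration, a percent identity figure of this kind may be derived from a pairwise
alignment as sketched below. The function name is hypothetical and the choice of
denominator (aligned columns in which neither sequence has a gap) is an assumption; this
specification does not state which alignment program or identity convention was used.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

With the toy alignment "MKT-LV" against "MKSALV", five columns are compared and four
match, giving 80% identity.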

The sequence alignments between grh, mgr, bom, CP2 and LBP-1a revealed that mgr and
bom are more closely related to grh than the previously identified homologs CP2 and
LBP-1a (Table 4). This homology is particularly evident in the DNA binding and dimerization
domains emphasizing the importance of protein/protein and protein/DNA interactions for
the function of these factors.
TABLE 4 Amino acid sequence comparison of GRH-like genes and Drosophila grh

Amino acid identity/similarity to Grainyhead (%)

          Overall    DNA-binding domain    Dimerization domain
MGR       37/52      48/64                 39/61
BOM       35/52      46/63                 37/61
SOM       33/48      42/60                 38/57
CP2       26/42      32/52                 29/47
LBP-1a    23/39      31/51                 28/43


EXAMPLE 3
Identification of a second isoform of MGR

A striking feature of the alignment between MGR and BOM was the absence of an MGR
domain corresponding to the first 93 amino acids of BOM. In view of the absence of
tissue-specific isoforms of GRH, the EST database was searched for similar sequences
using the 5' end of bom as a query. A highly similar but non-identical sequence in an EST
from murine kidney was located. The most 3' 30 nucleotides of this EST were identical to
30 nucleotides close to the 5' end of mgr. Based on this, primers were designed from
the kidney EST and mgr cDNA sequences and a product of the predicted size was amplified
from murine kidney cDNA. A similar product was also amplified from human kidney
cDNA. Amino acid sequence analysis of the murine product revealed that it was highly
homologous to the 5' end of the BOM protein and contiguous with the mgr open reading
frame. However, it lacked the first 11 amino acids of a previously isolated mgr clone
suggesting the presence of alternate splicing. To examine this, the murine mgr genomic
locus was isolated and mapped. As shown in Figure 1B, the first three coding exons in the
locus are exclusive to the p70 isoform of mgr. In contrast, the first coding exon of the
shorter isoform of mgr (p61) is absent in the p70 isoform. Significantly, the 5' end of this
exon lacks a splice acceptor site explaining its absence from the longer isoform. Instead,
promoter sequences with a clear TATA box and CAP site are evident in close proximity to
the translation initiation site (Figure 1C). Subsequent mapping of the human genomic locus
revealed that murine exon four was conserved in the human p70 protein but was absent in
the 49 kDa isoform of MGR.
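
The observation that the 3'-most 30 nucleotides of the kidney EST match a stretch near the
5' end of mgr can be illustrated with the following sketch, which locates such a shared
stretch and joins the two sequences through it. The function names are hypothetical, an
exact match is assumed, and no actual EST or cDNA sequence from this specification is
reproduced.

```python
def locate_est_tail(est: str, cdna: str, tail_len: int = 30) -> int:
    """Return the 0-based index in cdna of the last tail_len bases of est, or -1."""
    if len(est) < tail_len:
        raise ValueError("EST is shorter than the requested tail length")
    return cdna.upper().find(est[-tail_len:].upper())


def extend_cdna(est: str, cdna: str, tail_len: int = 30) -> str:
    """Join est onto cdna through the shared tail, giving one contiguous sequence.

    Any cdna sequence upstream of the shared stretch is not carried over, since
    the EST supplies the alternative 5' sequence.
    """
    pos = locate_est_tail(est, cdna, tail_len)
    if pos < 0:
        raise ValueError("no exact match for the EST tail in the cDNA")
    return est + cdna[pos + tail_len:]
```

Sequence upstream of the shared stretch in the original clone is dropped on joining, which
mirrors the finding that the amplified product lacked the first 11 amino acids of the
previously isolated mgr clone.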

EXAMPLE 4
The first three exons of the mgr genomic locus encode a
transcriptional activation domain

Although significant sequence homology exists between grh and the shorter mgr isoforms
and p70 mgr, the isoleucine rich transcriptional activation domain identified in the fly
protein is not conserved. Examination of the MGR-coding sequences failed to reveal a
region homologous to other known transactivation domains. In view of the high degree of
conservation of the first three coding exons of p70 mgr and bom, it was postulated that this
could be the functional domain responsible for activation. To address this, the cDNA
fragment encoding the first 93 amino acids of human p70 MGR (encoded by the first three
exons) was subcloned in frame with the GAL4 DNA binding domain in a mammalian
expression vector. The comparable region of BOM and the full length p49 MGR cDNA
were also cloned in frame into this vector. These plasmids were co-transfected into the human
293T cell line with a reporter plasmid containing five concatamerized GAL4 DNA binding
sites upstream of the chloramphenicol acetyltransferase (CAT) gene. The vector containing
only the GAL4 DNA-BD or containing the VP16 activation domain fused to the GAL4
DNA-BD served as the negative and positive controls, respectively. As shown in Figure 3,
transcriptional activation of the CAT gene was observed with VP16, p70 MGR and the
bom containing plasmids. No activation was observed with p49 mgr or the empty vector.
These findings confirm the presence of a highly conserved activation domain in the p70
mgr and bom that is lacking in p49 mgr.


EXAMPLE 5
MGR binds to known GRH binding sites

To determine the extent of the functional homology between GRH and MGR, it was
initially examined whether the mammalian protein could bind to the well-characterized
binding sites for the Drosophila factor in the Dopa decarboxylase and PCNA gene
regulatory regions (Uv et al, Mol Cell Biol 17: 6727-6735, 1997; Hayashi et al, J. Biol
Chem. 274: 35080-35088, 1999). Oligonucleotide probes encompassing these sites were
incubated with nuclear extract from the human placental cell line JEG-3, which expresses
both isoforms of MGR at RNA and protein level, and analyzed in an electrophoretic
mobility shift assay (EMSA).

EMSA were performed as previously described (Jane et al, EMBO J. 14: 97-105, 1995)
with the following oligonucleotide probes (sense strand only given): Drosophila dopa
decarboxylase promoter (Uv et al, 1997, supra) -
GGTGGTGCTCTAATAACCGGTTTCCAAGATGCGC [SEQ ID NO:31]; Drosophila PCNA promoter
(Hayashi et al, 1999, supra) - GGGTAAAAAGTGTGAACAATCAAACCAGTTGGCA [SEQ ID NO:32];
human Engrailed-1 promoter (Logan et al, Dev. Genet 13: 345-358, 1992) -
GGACACACACCCAAACCCACACCCACCCACAAACACACAAACCGGCAGTGACAACAACCACCCATCCTTCAATAACAGCAACCA
[SEQ ID NO:33]. In some assays, anti-MGR polyclonal antiserum was included in the
reaction mix. Two antisera were used for this purpose: antisera 611 - raised against
peptides common to the p70 and p49 MGR proteins in the dimerization domain; and
antisera 67 - raised against unique peptides in the NH2-terminal domain of p70 MGR.
Nuclear extract for these assays was obtained from the human placental cell line, JEG-3.


As shown in Figure 2A, a specific protein/DNA complex was observed with the PCNA
probe in the presence of pre-immune sera (lanes 1 and 3). This complex was supershifted
with the addition of anti-p70 specific antisera raised against peptides in the amino terminal
region of the protein (lane 4) and ablated with the addition of anti-MGR antisera raised
against peptides common to p49 and p70 MGR in the dimerization domain of the protein
(lane 2). Neither antiserum cross-reacted with BOM. Similar results were obtained with the
Dopa decarboxylase promoter probe (Figure 2B).
EXAMPLE 6
MGR binds to the human Engrailed-1 promoter

Many Drosophila genes regulated by GRH have known mammalian homologs. In terms of
functional homology, Engrailed-1 (En-1) is one of the best characterized. The En-1
promoter was, therefore, examined for the grainyhead consensus DNA binding sequence
derived from a comparison of the Drosophila Ultrabithorax, Dopa decarboxylase and
fushi tarazu promoters (Dynlacht et al, Genes Dev. 3: 1677-1688, 1989). As shown in
Figure 3A, a highly conserved region was identified in the proximal En-1 promoter.
Moreover, this sequence was also largely conserved in the DNase I footprint attributed to
grh in the Drosophila engrailed promoter (Soeller et al, Genes Dev. 2: 68-81, 1988). The
ability of this region of the human En-1 promoter to compete off MGR binding to the Ddc
probe (Figure 3B) in an EMSA with nuclear extract from JEG-3 cells was examined. As
shown in Figure 3B, the specific MGR/DNA complex observed with the Ddc probe (lane
1) was supershifted with the addition of MGR antisera 67 (lane 2) and ablated with the
addition of a 50-fold excess of unlabeled Ddc probe as competitor (lane 3). Addition of a
10- (lane 4) or 20-fold (lane 5) excess of unlabeled En-1 probe also markedly reduced the
binding of MGR to the Ddc probe.

EXAMPLE 7
MGR activates transcription

To determine the functional significance of this binding, this region of the En-1 promoter
was linked to a minimal globin gene promoter/luciferase reporter gene construct and
transfected into the MGR null cell line COS, in the presence of p70 MGR mammalian
expression vector or the empty vector. Transfection of the minimal promoter/reporter or
the TK promoter linked to a Renilla luciferase gene with either vector served as the
controls. As shown in Figure 3C, expression of p70 MGR dramatically enhanced the
transcriptional activity of the En-1 promoter (solid bars) but not the control minimal
promoter (open bars) or the TK promoter (hatched bars).
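
Reporter data of this kind are commonly expressed as fold activation relative to the empty
vector. The sketch below shows one such calculation under stated assumptions: that the
En-1 reporter is a firefly luciferase, that each reading is divided by the Renilla
luciferase reading from the same well, and that the function names and all numbers are
hypothetical. This Example does not state how the values in Figure 3C were calculated.

```python
def normalized_activity(firefly: float, renilla: float) -> float:
    """Reporter reading divided by the co-transfected Renilla control reading."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def fold_activation(with_factor: tuple, with_empty_vector: tuple) -> float:
    """Normalized activity with the expression vector relative to the empty vector.

    Each argument is a (firefly, renilla) pair of luminometer readings.
    """
    baseline = normalized_activity(*with_empty_vector)
    if baseline == 0:
        raise ValueError("baseline activity is zero; fold change is undefined")
    return normalized_activity(*with_factor) / baseline
```

For example, hypothetical readings of (9000, 300) with the expression vector and
(1500, 500) with the empty vector correspond to a 10-fold activation.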
EXAMPLE 8
Cloning of full-length human SOM

Human SOM was cloned using primers derived from a high throughput genomic sequence
(HTGS) and a human expressed sequence tag (EST) obtained from GenBank databases
which, respectively, aligned with the dimerization domain and the activation domain of
other MGR members. Using nested RT-PCR and human tonsil cDNA, another contig
spanning 1300 nucleotides was obtained.

Utilizing 5' RACE, further oligoprimers and human testis cDNA, a 210 nucleotide
sequence incorporating the initiating ATG was obtained. A contig of these overlapping
sequences revealed the full length human SOM which upon alignment with other existing
MGR family members showed >60% similarity at the protein level with conservation at
the 5' activation, DNA-binding and dimerization domains.

EXAMPLE 9 
Cloning of full-length murine SOM 

A murine EST (GenBank) from optic cup tissue was identified, which when aligned with 
other murine homologs of the MGR family showed 70% similarity at the amino acid level, 
in the region of the DNA binding domain. Using semi-nested RT-PCR with murine testis 
cDNA, a 286 nucleotide sequence was amplified, cloned and sequenced for use as a probe. 

Subsequently, a murine brain cDNA library (Stratagene) was screened. One clone was
taken through to quaternary stage. This clone was excised from the λZAP II vector into
pBluescript and sequenced in both directions. A 1200 nucleotide length sequence was
obtained, which lacked the 5' end. This was subsequently identified using 5' RACE from
murine testis cDNA. A contig of these two sequences revealed the full length murine
SOM.

EXAMPLE 10 
Generation of mice heterozygous at the GRHL-3 locus 

The murine GRHL-3 locus was isolated by screening a 129/SV/J genomic library with a
cDNA fragment derived from the 5' end of the gene. Polymerase chain reaction (PCR) was
then used to generate a 5.8 kb NotI-BamHI fragment that, when cloned into the plasmid
pβgalpAloxneo, fused the second codon of GRHL-3 exon 2 to the ATG of β-galactosidase.
The 3' flanking region was a SalI-KpnI fragment extending 2.6 kb from the beginning of
intron 3. The thymidine kinase gene driven off the MC1 promoter was inserted into the
targeting construct distal to the 3' arm as a SacII-NotI fragment and the vector was
linearized with NotI and electroporated into W9.5 embryonic stem cells. Transfected cells
were selected in G418 and resistant clones picked and expanded. Clones were identified in
which the targeting vector had recombined with the endogenous GRHL-3 gene by
hybridising SpeI-digested genomic DNA with a 0.5 kb SpeI-SalI fragment situated in the 5'
GRHL-3 genomic sequence just outside the targeting vector. This probe distinguished
between the endogenous (8.4 kb) and targeted (11.5 kb) GRHL-3 alleles. Two correctly
targeted embryonic stem cell clones were injected into C57BL/6 blastocysts to generate
chimeric mice. Male chimeras were mated with C57BL/6 females to yield GRHL-3
heterozygotes which were identified by hybridising BamHI-digested genomic DNA from a
tail biopsy with a 0.85 kb NcoI fragment situated in the 3' GRHL-3 genomic sequence just
outside the targeting vector. This probe distinguished between the endogenous (5.2 kb) and
targeted (10.7 kb) alleles. Heterozygous mice were bred with Cre deleter transgenic mice
to excise the NeoR cassette. GRHL-3 heterozygotes in which the NeoR cassette had been
deleted were interbred to produce wild type (GRHL-3+/+), heterozygous (GRHL-3+/-) and
mutant (GRHL-3-/-) mice. The inability of the targeted allele to produce GRHL-3
messenger RNA was confirmed in nucleic acid blots.

C57BL/6J inbred mice were obtained from the Walter and Eliza Hall Institute animal
facility, and the ct/ct mouse stock from the Jackson Laboratory. All experiments were
approved by the Melbourne Health Animal Ethics Committee.

EXAMPLE 11 
Genotyping GRHL-3 mutant mice 

Mice were genotyped by PCR using genomic DNA template prepared from tail biopsies or
embryonic tissues. A product of 812 bp was generated from the wild type GRHL-3 allele
and/or a product of 579 bp was generated from the targeted GRHL-3 allele. Primers used
were specific for intron 1, common to the wild type and targeted GRHL-3 alleles (sense,
5'-GGATCAGAAGACCATGCC-3') (SEQ ID NO:40); intron 2, deleted from the targeted
GRHL-3 allele (antisense, 5'-AGGCTGTTAGAGTTGGTG-3') (SEQ ID NO:41); and the
lacZ cassette, present only in the targeted GRHL-3 allele (antisense,
5'-CTGTAGCCAGCTTTCATC-3') (SEQ ID NO:42). PCR conditions were 94 °C for 2
minutes followed by 35 cycles of 94 °C for 30 seconds, 55 °C for 30 seconds and 72 °C for
1 minute with a final 5 minutes extension at 72 °C.
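
The way the two product sizes map onto the three genotypes can be sketched as follows. This is an illustrative sketch only: the function name `call_genotype` and the sizing tolerance `tolerance_bp` are assumptions introduced here and are not part of the described protocol; only the 812 bp and 579 bp product sizes come from the text.

```python
# Expected PCR product sizes (bp) as stated in Example 11.
WILD_TYPE_BP = 812
TARGETED_BP = 579


def call_genotype(band_sizes_bp, tolerance_bp=25):
    """Return '+/+', '+/-' or '-/-' from the product sizes seen in one lane.

    tolerance_bp is a hypothetical allowance for gel sizing error.
    Returns None when neither diagnostic product is present.
    """
    has_wild_type = any(abs(b - WILD_TYPE_BP) <= tolerance_bp for b in band_sizes_bp)
    has_targeted = any(abs(b - TARGETED_BP) <= tolerance_bp for b in band_sizes_bp)
    if has_wild_type and has_targeted:
        return "+/-"   # both alleles amplified
    if has_wild_type:
        return "+/+"   # only the wild type allele amplified
    if has_targeted:
        return "-/-"   # only the targeted allele amplified
    return None        # failed reaction or unreadable lane


if __name__ == "__main__":
    for lane in ([812], [812, 579], [579], []):
        print(lane, call_genotype(lane))
```

A lane with no diagnostic product is reported as `None` rather than guessed, so that failed reactions are repeated instead of being scored.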


EXAMPLE 12 
Inositol and folate administration during pregnancy 

GRHL-3+/- mice were inter-crossed and folate, inositol or PBS placebo was administered to
pregnant females as previously described. Embryos were harvested on E14.5,
genotyped and examined morphologically. Mean litter size and frequency of resorptions
did not differ significantly for folate-, inositol- or placebo-treated litters. The results were
analysed statistically by the one-sided binomial probability test.

EXAMPLE 13
GRHL-3 Northern hybridization and RT-PCR

A unique GRHL-3 cDNA probe from nucleotides 404 to 889 was hybridized to a blot
containing 35 µg of total RNA from E14.5 ct/ct embryos. RT-PCR was performed on
cDNA from DNaseI-treated total RNA (2 µg) isolated from whole E9.5 embryos (RNeasy,
Qiagen) using a First Strand Synthesis for RT-PCR kit (Amersham). One-tenth of the total
cDNA was used as the template in PCR reactions containing primers specific for HPRT
(sense, 5'-GCTGGTGAAAAGGACCTCT-3' (SEQ ID NO:43); antisense,
5'-CACAGGACTAGAACACCTGC-3' (SEQ ID NO:44)). The cDNA sample was then
diluted to give similar amplification of HPRT under the same PCR conditions prior to use
in PCR reactions containing GRHL-3-specific primers. E9.5 GRHL-3+/- and GRHL-3-/-
embryo cDNA was amplified with specific primers annealing to exon 8 and exon 13
(sense, 5'-CACATTGAAGAGGTGGC-3' (SEQ ID NO:45); antisense,
5'-AAGGGTGAGCAGGTTCGCTT-3' (SEQ ID NO:46)). PCR conditions were 94 °C for 2
minutes followed by various cycles of 94 °C for 30 seconds, 60 °C for 30 seconds and 72
°C for 1 minute. All PCR products were electrophoresed on 1.5% agarose gels, transferred
to nitrocellulose membranes and analyzed by Southern blotting using 32P-labelled internal
oligonucleotides as probes.

Quantitative Real-Time RT-PCR was performed in a Rotorgene 2000 (Corbett Research,
Australia) in a final volume of 20 µl. Reaction mixtures comprised 1X reaction buffer plus
2.5 mM (HPRT) or 3 mM (GRHL-3) MgCl2, 0.05 mM dNTPs (Roche), 0.1 gene-specific
primers, 1 U Taq (Fisher Biotech, Australia), a 1/10000 dilution of SYBR Green I
(Molecular Probes, USA), and 2 µl of sample or standard. Cycling conditions were 94 °C
for 15 seconds, 55 °C (GRHL-3) or 52 °C (HPRT) for 30 seconds and 72 °C for 30
seconds. For each reaction, standard curves were generated and relative quantities of each
transcript were calculated from these. The ratio of GRHL-3/HPRT normalised to wild type
E14.5 is shown. Error bars show the sum of the standard deviations for each sample as a
proportion of the normalised signal. The identity of the PCR products was confirmed by
melt curve analysis and agarose gel electrophoresis on a 1.5% agarose gel.
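
The normalisation described above (relative quantities read from standard curves, then the GRHL-3/HPRT ratio expressed relative to wild type) can be sketched as follows. This is a minimal sketch under stated assumptions: it assumes the standard curves are linear fits of threshold cycle against log10 quantity, which the text does not state, and the function names and the numbers in the usage lines are hypothetical illustrations, not measured data.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert an assumed standard curve of the form Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def normalised_ratio(sample_ct, wild_type_ct, curves):
    """GRHL-3/HPRT ratio of a sample, expressed relative to the wild type ratio.

    sample_ct and wild_type_ct map transcript name -> threshold cycle;
    curves maps transcript name -> (slope, intercept) of its standard curve.
    """
    def ratio(cts):
        grhl3 = quantity_from_ct(cts["GRHL-3"], *curves["GRHL-3"])
        hprt = quantity_from_ct(cts["HPRT"], *curves["HPRT"])
        return grhl3 / hprt

    return ratio(sample_ct) / ratio(wild_type_ct)


if __name__ == "__main__":
    # Hypothetical curve parameters and threshold cycles, for illustration only.
    curves = {"GRHL-3": (-3.32, 30.0), "HPRT": (-3.32, 30.0)}
    wild_type = {"GRHL-3": 24.0, "HPRT": 22.0}
    sample = {"GRHL-3": 25.6, "HPRT": 22.0}
    print(normalised_ratio(sample, wild_type, curves))
```

Dividing by the HPRT quantity corrects for differences in input cDNA, and dividing by the wild type ratio places the wild type value at 1.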

EXAMPLE 14 

Histology, in situ hybridization and whole mount skeletal staining of GRHL-3 mutant mice

Embryos from timed pregnant C57BL/6J females and GRHL-3+/- intercrosses were
immersion fixed in 4% paraformaldehyde in phosphate buffered saline, pH 7.3. The
embryos were then embedded in paraffin wax before 8 µm sections were cut on a
microtome and placed on gelatine-coated slides. For histological analysis, sections were
stained with hematoxylin and eosin. In situ hybridisation was performed as described
previously. A radio-labelled GRHL-3 antisense RNA probe was transcribed from a
pBluescript II SK plasmid (Stratagene) carrying a 485 bp fragment of the GRHL-3 coding
region (nt 404 to 889) using T7 RNA polymerase. All sections for in situ analysis were
counter-stained with hematoxylin. In situ hybridisation was also performed using a full-length
GRHL-3 probe and the same expression pattern was observed (data not shown).
Hybridisation signal was similar to background levels in embryos homozygous for the
GRHL-3 mutation. Whole mount skeletal staining on E17.5 embryos was performed as
described previously.

EXAMPLE 15 
Expression of GRHL-3 during mouse development 

One criterion for defining specific neurulation genes is that they are expressed in the 
region of the folding neural plate at the appropriate developmental time point. To 
determine the pattern of expression of GRHL-3 during murine development, in situ 
hybridisation studies were performed using a probe specific for GRHL-3 (Fig. 4). In 
embryos at E8, expression was confined to the non-neuronal ectoderm
immediately adjacent to the neural plate that was undergoing folding to form the neural
tube (Fig. 4A,B). At later time-points, more widespread expression was observed in the
surface ectoderm with a progressive increase from E12.5 to E15.5 (Fig. 4C,D). The
pattern of expression at later time points is similar to the expression profiles of murine 
GRHL-1 and -2 (Ref. 8). 


EXAMPLE 16 
NTDs in GRHL-3 mutant mice 

To determine the functional role of GRHL-3 during mouse development, a 2.2 kb deletion
in GRHL-3 was generated by gene targeting (Fig. 5A-D). Northern blot and RT-PCR
analysis indicated that the targeted GRHL-3 allele represented a null mutation (Fig. 5E,F).
Genotyping of offspring from GRHL-3+/- intercrosses from mid and late gestation showed
that GRHL-3-/- mice were represented in Mendelian proportions up to E18.5. Of 874
embryos examined on or before this time, 191 (22%) were genotyped as GRHL-3-/-. No
GRHL-3-/- embryos survived to weaning.

All GRHL-3-/- pups, without exception, displayed neural tube defects (NTDs). GRHL-3+/-
mice were indistinguishable from their wild-type littermates. As most of the newborn
GRHL-3-/- pups were cannibalised by their mothers, we examined the phenotype in more
detail in developing embryos (Fig. 6A). All GRHL-3-/- embryos exhibited thoraco-lumbo-sacral
spina bifida and curled tail and 3% had co-incident exencephaly. GRHL-3-/-
embryos were also smaller than their littermate controls. Full body skeletal preparations
demonstrated abnormalities in the vertebral column with kyphosis, splayed spinal
processes and lack of vertebral arch formation (Fig. 6B,C). Transverse sections through the
thoracic, lumbar and sacral regions at different developmental time points showed that
spina bifida in GRHL-3-/- embryos was due to a primary failure of neural tube closure. The
neural plate appeared to furrow normally with the formation of the median hinge point, but
neural fold elevation did not occur and the neuro-epithelium remained convex throughout
gestation (Fig. 6D).

EXAMPLE 17
The GRHL-3 gene is allelic with the ct gene

Analysis of GRHL-3 mutant mice revealed phenotypic similarities between the GRHL-3-/-
embryos and those reported in the curly tail mouse mutant 4. The curly tail gene is
incompletely penetrant, with homozygotes developing lumbo-sacral spina bifida aperta
(12%), curled tail (50%) and exencephaly (3%). The ct locus had been mapped to murine
chromosome 4, at position 63.4, close to the D4Mit69 marker. The NCBI STS database was
searched to ascertain the chromosomal localisation of GRHL-3 in mice. The mGRHL-3 gene
is also located on chromosome 4, approximately 3 kb from D4Mit157 at position 63.4. This
marker, which lies within 800 kb of D4Mit69, had not been included in the original curly
tail mapping studies. Both markers were subsequently identified in a recently deposited 13
Mb contiguous sequence from chromosome 4 (Fig. 7A). Also contained in this sequence
were several genes that have previously been studied (and excluded) as ct candidate genes.
GRHL-3 was positioned closer to the D4Mit69 marker than all of these excluded
candidates. Genetic complementation studies were therefore performed using mice
heterozygous for the null GRHL-3 allele and ct/ct mice. Embryos from these matings were
harvested between E11.5 and E18.5, genotyped and examined morphologically.
Among the 101 embryos obtained, NTDs were the only gross abnormalities observed (Fig.
7B). These were confined, without exception, to embryos with a GRHL-3+/-/ct genotype
(Table 5). The incidence of spina bifida in mice carrying both mutant alleles was higher
than reported for curly tail homozygotes (31% versus 12%), but the extent of the defect
more closely resembled that of the ct/ct mice (lumbo-sacral spina bifida) than the
GRHL-3-/- mice (thoraco-lumbo-sacral spina bifida). Tail flexion defects alone were identified in an
additional 23% of embryos, all of which were genotyped as GRHL-3+/-/ct. Ten embryos
carrying both mutant alleles appeared morphologically normal and the remaining 37% of
embryos were unremarkable and genotyped as GRHL-3+/+/ct.

The expression of GRHL-3 in curly tail homozygous embryos was compared with wild
type and GRHL-3+/- controls by Northern blotting (Fig. 7C). Densitometry of the GRHL-3
signal normalised to the 28S signal obtained by Phosphorimager analysis revealed a
significant reduction in the level of GRHL-3 mRNA in ct/ct embryos (19-34%) compared
with wild type or GRHL-3 heterozygous controls. Real-time quantitative RT-PCR was
performed on mRNA from these embryos and confirmed that the level of GRHL-3
expression in ct/ct embryos was reduced approximately 3-fold compared to wild type
controls (Fig. 7D).

TABLE 5 Phenotypes of embryos from GRHL-3+/- X ct/ct crosses

Phenotype                    Number    Genotype
Spina bifida + curly tail    31        GRHL-3+/-/ct - 31
Curled tail                  23        GRHL-3+/-/ct - 23
Normal                       47        GRHL-3+/-/ct - 10
                                       GRHL-3+/+/ct - 37
Total                        101       101

Embryos were harvested between E11.5 and E18.5.
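
The counts in Table 5 can be tallied against two different denominators: all embryos, and only those embryos carrying both mutant alleles. The sketch below is illustrative arithmetic on the tabulated counts; the variable names and the genotype labels used as dictionary keys are assumptions introduced here.

```python
# Counts transcribed from Table 5: (phenotype, genotype) -> number of embryos.
counts = {
    ("spina bifida + curly tail", "GRHL-3+/-/ct"): 31,
    ("curled tail", "GRHL-3+/-/ct"): 23,
    ("normal", "GRHL-3+/-/ct"): 10,
    ("normal", "GRHL-3+/+/ct"): 37,
}

total = sum(counts.values())
# Embryos carrying both the null GRHL-3 allele and the ct allele.
double_mutant = sum(n for (_, genotype), n in counts.items() if genotype == "GRHL-3+/-/ct")


def fraction(phenotype, denominator):
    """Fraction of the chosen denominator showing the given phenotype in double mutants."""
    return counts[(phenotype, "GRHL-3+/-/ct")] / denominator


if __name__ == "__main__":
    print("total embryos:", total)
    print("double mutants:", double_mutant)
    for phenotype in ("spina bifida + curly tail", "curled tail", "normal"):
        print(phenotype,
              f"{100 * fraction(phenotype, total):.1f}% of all embryos,",
              f"{100 * fraction(phenotype, double_mutant):.1f}% of double mutants")
```

The percentages quoted in the text (31%, 23%, 37%) correspond to the all-embryo denominator of 101; expressed per embryo carrying both mutant alleles (64 in total), spina bifida affects roughly 48%.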




EXAMPLE 18 

NTDs in GRHL-3-/- embryos are folate- and inositol-resistant

NTDs in the curly tail mice are resistant to folate administered in early gestation.
However, inositol therapy in pregnancy results in a marked reduction in the incidence of
spina bifida. The effects of folate and inositol administration on pregnant GRHL-3+/- mice
previously mated with GRHL-3+/- males were examined (Table 6). As expected, no rescue
of spina bifida in GRHL-3-/- embryos with placebo or folate treatment was observed. In
contrast to the ct/ct mice, neither the incidence nor the severity of the spina bifida in the
GRHL-3-/- embryos was alleviated by inositol treatment. Although the number of embryos
examined was small, the result was highly significant (p<0.001) given the 70% reduction
in the incidence of spina bifida in inositol-treated ct/ct embryos. These findings indicate
that GRHL-3 expression is essential for inositol-mediated rescue of folate-resistant NTDs.



TABLE 6 Effects of folate and inositol administration on NTDs in GRHL-3-/- mice

Genotype GRHL-3-/-    Placebo (n=5)    Folate (n=4)    Inositol (n=8)
Predicted NTDs        100%             100%            30%
Observed NTDs         100%             100%            100%*

*p < 0.001
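
The significance level reported for the inositol group can be illustrated with a one-sided binomial calculation. This is a sketch under an assumption about how the test was set up: it takes the predicted NTD rate of 30% as the null probability and asks how likely it would be to see NTDs in all 8 of 8 embryos. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """One-sided probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Inositol column of Table 6: 8 embryos, all with NTDs, predicted rate 30%.
p_value = binomial_upper_tail(8, 8, 0.30)

if __name__ == "__main__":
    print(f"one-sided p = {p_value:.2e}")
```

Under that assumption the probability reduces to 0.3 raised to the eighth power, which is below the 0.001 threshold quoted in the table footnote.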

CLAIMS

1. An isolated nucleic acid molecule comprising a sequence of nucleotides encoding 
or complementary to a sequence encoding a mammalian homolog of Drosophila 
grh. 

2. The isolated nucleic acid molecule of claim 1 wherein the Drosophila grh is 
selected from grainyhead (mgr), brother of mgr (bom) and sister of mgr (som) or a 
mutant, derivative, homolog or analog thereof. 

3. The isolated nucleic acid molecule of claim 1 wherein the nucleotide sequence of
Drosophila grh is selected from SEQ ID NO: 17, SEQ ID NO: 34, SEQ ID NO: 36
and SEQ ID NO: 38.

4. The isolated nucleic acid molecule of claim 3 wherein the mammalian homolog
comprises a nucleotide sequence having at least 65% identity after optimal
alignment to one or more of SEQ ID NO: 17, SEQ ID NO: 34, SEQ ID NO: 36 and
SEQ ID NO: 38 or comprises a nucleotide sequence capable of hybridizing to SEQ
ID NO: 17, SEQ ID NO: 34, SEQ ID NO: 36 and/or SEQ ID NO: 38 or a
complementary form thereof under stringency conditions.

5. The isolated nucleic acid molecule of claim 1 or 2 or 3 or 4 wherein the nucleic
acid molecule encodes a mammalian protein transcription factor selected from
human MGR p49 (SEQ ID NO: 2), human MGR p70 (SEQ ID NO: 4), human
BOM (SEQ ID NO: 6), human SOM (SEQ ID NO: 7), murine MGR p61 (SEQ ID
NO: 10), murine MGR p70 (SEQ ID NO: 12), murine BOM (SEQ ID NO: 14) and
murine SOM (SEQ ID NO: 16).

6. An isolated nucleic acid molecule comprising a sequence of nucleotides encoding a
polypeptide having transcription factor activity and comprising an amino acid
sequence substantially as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6,
SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 or SEQ ID NO:16
or an amino acid sequence having at least about 60% similarity to SEQ ID NO:2,
SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12,
SEQ ID NO:14 or SEQ ID NO:16 after optimal alignment wherein said polypeptide
is a mammalian homolog of Drosophila GRH.

7. An isolated nucleic acid molecule encoding a mammalian transcription factor
homolog of Drosophila grh and comprising a nucleotide sequence selected from
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ
ID NO:11, SEQ ID NO:13 and SEQ ID NO:15 or a nucleotide sequence having at
least about 60% similarity to any one of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13 or SEQ ID
NO:15 after optimal alignment or a nucleotide sequence capable of hybridizing to
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ
ID NO:11, SEQ ID NO:13 or SEQ ID NO:15 or a complementary form thereof
under low stringency conditions.

8. An isolated nucleic acid molecule comprising the nucleotide sequence set forth in
SEQ ID NO: 1.

9. An isolated nucleic acid molecule comprising the nucleotide sequence set forth in
SEQ ID NO: 3.

10. An isolated nucleic acid molecule comprising the nucleotide sequence set forth in
SEQ ID NO: 5.

11. An isolated nucleic acid molecule comprising the nucleotide sequence set forth in
SEQ ID NO: 7.

12. An isolated nucleic acid molecule comprising the nucleotide sequence set forth in
SEQ ID NO: 9.

13. An isolated nucleic acid molecule comprising the nucleotide sequence set forth in
SEQ ID NO: 11.

14. An isolated nucleic acid molecule comprising the nucleotide sequence set forth in
SEQ ID NO: 13.

15. An isolated nucleic acid molecule comprising the nucleotide sequence set forth in
SEQ ID NO: 15.

16. A method for identifying a M-GRH, said method comprising screening a nucleotide
database and identifying a nucleotide sequence having at least 60% similarity to
SEQ ID NO: 17 or SEQ ID NO:34 or SEQ ID NO:36 or SEQ ID NO:38 after
optimal alignment.

17. Means of identifying a nucleotide sequence likely to encode an M-GRH
transcription factor, said method comprising interrogating a mammalian genome
database conceptually translated into different reading frames with an amino acid
sequence defining Drosophila GRH or any one of SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 and
SEQ ID NO:16 and identifying a nucleotide sequence corresponding to an amino
acid sequence having at least about 60% similarity to Drosophila GRH or to any
one of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID
NO:10, SEQ ID NO:12, SEQ ID NO:14 and SEQ ID NO:16.

18. A method for detecting an aberrant phenotype or a propensity for an aberrant
phenotype to develop, said method comprising screening for a variation in a
nucleotide sequence encoding a mammalian MGR, BOM and/or SOM or their
homologs.

19. An isolated mammalian transcription factor which is a homolog of Drosophila
grainyhead (GRH) selected from human MGR p49 (SEQ ID NO: 2), human MGR
p70 (SEQ ID NO: 4), human BOM (SEQ ID NO: 6), human SOM (SEQ ID NO:
8), murine MGR p61 (SEQ ID NO: 10), murine MGR p70 (SEQ ID NO: 12),
murine BOM (SEQ ID NO: 14) and murine SOM (SEQ ID NO: 16).

20. An antibody to the isolated transcription factor of claim 19.

21. The antibody of claim 20 wherein the antibody is a monoclonal antibody.

22. A method for detecting an aberrant phenotype or a propensity for an aberrant 
phenotype to develop, said method comprising screening for a variation in an 
amino acid sequence encoding MGR, BOM and/or SOM or their homologs. 

23. A method for detecting a mammalian transcription factor or fragment thereof in a 
biological sample from a subject, said method comprising contacting said 
biological sample with an antibody specific for said mammalian transcription factor 
or fragment thereof or its derivatives or homologs for a time and under conditions 
sufficient for an antibody-polypeptide complex to form, and then detecting said 
complex. 

24. An animal model comprising a genetically modified animal comprising a
nucleotide insertion, deletion, addition and/or substitution in a nucleic acid molecule of
claims 1 to 15.

25. A medical assessment system comprising the animal model of claim 24. 
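
By way of illustration only, the percentage thresholds recited after "optimal alignment" in the claims above (for example, the 60% figure in claims 16 and 17) can be evaluated along the following lines. This is a minimal sketch and not the method of the specification: it assumes a Needleman-Wunsch global alignment with unit match, mismatch and gap scores, and it reports identity (exact matches) rather than similarity, which may also credit conservative substitutions. The function names and scoring values are assumptions introduced here.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Exact matches as a percentage of the aligned length (gaps count as mismatches)."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(x == y for x, y in zip(aligned_a, aligned_b) if x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(a, b, threshold=60.0):
    """True if the two sequences reach the given percent identity."""
    return percent_identity(a, b) >= threshold


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))
    print(percent_identity("ACGT", "AGT"))
```

The quadratic-time alignment shown here is suitable only for short sequences; database-scale screening would in practice rely on a heuristic search tool.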

SEQUENCE LISTING

<110> Melbourne Health
Jane, Stephen (US only)
Wilanowski, Tomasz (US only)
Ting, Stephen (US only)

<120> Diagnostic and Therapeutic Agents

<130> 12301370/EJH

<150> US 60/402055
<151> 2002-08-09

<150> 2002951579
<151> 2002-08-22

<160> 46

<170> PatentIn version 3.1

<210> 1
<211> 1881
<212> DNA
<213> human

<220>
<221> CDS
<222> (94)..(1323)
<223>

<400> 1

ataagagagg ccatctgaca gctccagata cgacagtcac tgtctccata gcaacgatgc 60 

ctacccactc catcaagaca gaaacccagc cac atg gct tcg ctg tgg gaa tcc 114
Met Ala Ser Leu Trp Glu Ser
1 5

ccc cag cag tgt atc atc ctg agc cca ctg agc ggg tgg tgg ttt tcg 162
Pro Gln Gln Cys Ile Ile Leu Ser Pro Leu Ser Gly Trp Trp Phe Ser
10 15 20

atc gga atc tca ata ctg acc agt tca gct ctg gtg ctc aag ccc caa 210
Ile Gly Ile Ser Ile Leu Thr Ser Ser Ala Leu Val Leu Lys Pro Gln
25 30 35

atg ctc aaa ggc gaa ctc cag act cga cct tct cag aga cct tca agg 258
Met Leu Lys Gly Glu Leu Gln Thr Arg Pro Ser Gln Arg Pro Ser Arg
40 45 50 55

aag gcg ttc agg agg aac aac ttt gaa tat acc cta gaa gct tca aaa 306
Lys Ala Phe Arg Arg Asn Asn Phe Glu Tyr Thr Leu Glu Ala Ser Lys
60 65 70

tca ctt cga cag aag cca gga gac agt acc atg acg tac ctg aac aaa 354
Ser Leu Arg Gln Lys Pro Gly Asp Ser Thr Met Thr Tyr Leu Asn Lys
75 80 85

ggc cag ttc tat ccc atc acc ttg aag gag gtg agc agc agt gaa gga 402
Gly Gln Phe Tyr Pro Ile Thr Leu Lys Glu Val Ser Ser Ser Glu Gly
90 95 100

atc cat cat ccc atc agc aaa gtt cga agt gtg atc atg gtg gtt ttt 450
Ile His His Pro Ile Ser Lys Val Arg Ser Val Ile Met Val Val Phe
105 110 115

gct gaa gac aaa agc aga gaa gat cag tta agg cat tgg aag tac tgg 498
Ala Glu Asp Lys Ser Arg Glu Asp Gln Leu Arg His Trp Lys Tyr Trp
120 125 130 135

cac tcc cgg cag cac acc gct aaa caa aga tgc att gac ata gct gac 546
His Ser Arg Gln His Thr Ala Lys Gln Arg Cys Ile Asp Ile Ala Asp
140 145 150

tat aaa gaa agc ttc aac act atc agt aac atc gag gag att gcg tat 594
Tyr Lys Glu Ser Phe Asn Thr Ile Ser Asn Ile Glu Glu Ile Ala Tyr
155 160 165

aac gcc att tcc ttc aca tgg gac atc aac gat gaa gca aag gtt ttc 642
Asn Ala Ile Ser Phe Thr Trp Asp Ile Asn Asp Glu Ala Lys Val Phe
170 175 180

atc tct gtg aac tgc tta agc aca gat ttc tct tcc cag aag gga gtg 690
Ile Ser Val Asn Cys Leu Ser Thr Asp Phe Ser Ser Gln Lys Gly Val
185 190 195

aag ggg ttg cct ctt aac att caa gtt gat acc tat agt tac aac aac 738
Lys Gly Leu Pro Leu Asn Ile Gln Val Asp Thr Tyr Ser Tyr Asn Asn
200 205 210 215

cgc agc aac aag cct gtg cac cgg gcc tac tgc cag atc aag gtc ttc 786
Arg Ser Asn Lys Pro Val His Arg Ala Tyr Cys Gln Ile Lys Val Phe
220 225 230

tgt gac aag gga gct gag cgg aaa atc agg gat gaa gaa cga aag caa 834
Cys Asp Lys Gly Ala Glu Arg Lys Ile Arg Asp Glu Glu Arg Lys Gln
235 240 245

agc aaa aga aaa gtt tct gat gtt aaa gtg cca ctg ctt ccc tct cac 882
Ser Lys Arg Lys Val Ser Asp Val Lys Val Pro Leu Leu Pro Ser His
250 255 260

aag cga atg gat atc aca gtt ttc aaa ccc ttc att gat ctc gat act 930
Lys Arg Met Asp Ile Thr Val Phe Lys Pro Phe Ile Asp Leu Asp Thr
265 270 275

cag cct gtc ctc ttc att cct gac gtg cac ttt gcc aac ttg cag cgg 978
Gln Pro Val Leu Phe Ile Pro Asp Val His Phe Ala Asn Leu Gln Arg
280 285 290 295

ggc act cat gtc ctt ccc att gcc tct gaa gaa ttg gag ggt gaa ggc 1026
Gly Thr His Val Leu Pro Ile Ala Ser Glu Glu Leu Glu Gly Glu Gly
300 305 310

tct gtc ttg aaa agg ggg ccg tac ggc aca gaa gat gac ttt gct gtc 1074
Ser Val Leu Lys Arg Gly Pro Tyr Gly Thr Glu Asp Asp Phe Ala Val
315 320 325

cct cct tct acc aag ctg gcc cgg ata gaa gaa cca aag aga gtg ctg 1122
Pro Pro Ser Thr Lys Leu Ala Arg Ile Glu Glu Pro Lys Arg Val Leu
330 335 340

ctc tac gtt cga aag gag tca gaa gaa gtc ttt gat gcc ctg atg ctc 1170
Leu Tyr Val Arg Lys Glu Ser Glu Glu Val Phe Asp Ala Leu Met Leu
345 350 355

aaa acc cca tct ttg aag ggc ttg atg gaa gct atc tca gac aaa tac 1218
Lys Thr Pro Ser Leu Lys Gly Leu Met Glu Ala Ile Ser Asp Lys Tyr
360 365 370 375

gat gtt ccc cat gac aag att ggg aaa ata ttc aag aag tgt aaa aag 1266
Asp Val Pro His Asp Lys Ile Gly Lys Ile Phe Lys Lys Cys Lys Lys
380 385 390

ggg atc ctg gtg aac atg gac gac aac att gtg aag cat tac tcc aat 1314
Gly Ile Leu Val Asn Met Asp Asp Asn Ile Val Lys His Tyr Ser Asn
395 400 405

gag gac acc ttccagctgc agattgaaga agccgggggg tcttacaagc 1363
Glu Asp Thr
410



tcaccctgac ggagatctaa aggcctgcgg gccacagctc cccaggagtt cagtgcaggt 1423

gtttctagat cttacggttt ggcaactgca ggtaacccca gtcagccatg tcgccagcac 1483

aggtctatgt cgagggaatg ggttccttgc aggttggagg cggggctgca tctggcttgg 1543

tggtagcatt taatctattg cattggtgtt tttcagatga aagagaaatc catataccat 1603

tatgtttgaa tttcctgata tatacaggat ttaaagtgaa aactttattc caagagttaa 1663

cagagtctct gggaagcttt aggacatctg ctacgttatt tatcaaaata ttgggatctc 1723

tgccttgtgc ctacagtgtc gtgggcctgc tcgctagcag aagtcagaaa aggcgatagg 1783

cttggcttta aggatttcgt gcccttgcct gaattcagta caactccact gcctcacgtt 1843

agcgggagcg cacctgaaga gtacgggggg agccctct 1881



<210> 2 

<211> 410 

<212> PRT 

<213> human 

<400> 2 



Met Ala Ser Leu Trp Glu Ser Pro Gln Gln Cys Ile Ile Leu Ser Pro
1 5 10 15

Leu Ser Gly Trp Trp Phe Ser Ile Gly Ile Ser Ile Leu Thr Ser Ser
20 25 30

Ala Leu Val Leu Lys Pro Gln Met Leu Lys Gly Glu Leu Gln Thr Arg
35 40 45

Pro Ser Gln Arg Pro Ser Arg Lys Ala Phe Arg Arg Asn Asn Phe Glu
50 55 60

Tyr Thr Leu Glu Ala Ser Lys Ser Leu Arg Gln Lys Pro Gly Asp Ser
65 70 75 80

Thr Met Thr Tyr Leu Asn Lys Gly Gln Phe Tyr Pro Ile Thr Leu Lys
85 90 95

Glu Val Ser Ser Ser Glu Gly Ile His His Pro Ile Ser Lys Val Arg
100 105 110

Ser Val Ile Met Val Val Phe Ala Glu Asp Lys Ser Arg Glu Asp Gln
115 120 125

Leu Arg His Trp Lys Tyr Trp His Ser Arg Gln His Thr Ala Lys Gln
130 135 140

Arg Cys Ile Asp Ile Ala Asp Tyr Lys Glu Ser Phe Asn Thr Ile Ser
145 150 155 160

Asn Ile Glu Glu Ile Ala Tyr Asn Ala Ile Ser Phe Thr Trp Asp Ile
165 170 175

Asn Asp Glu Ala Lys Val Phe Ile Ser Val Asn Cys Leu Ser Thr Asp
180 185 190

Phe Ser Ser Gln Lys Gly Val Lys Gly Leu Pro Leu Asn Ile Gln Val
195 200 205

Asp Thr Tyr Ser Tyr Asn Asn Arg Ser Asn Lys Pro Val His Arg Ala
210 215 220

Tyr Cys Gln Ile Lys Val Phe Cys Asp Lys Gly Ala Glu Arg Lys Ile
225 230 235 240

Arg Asp Glu Glu Arg Lys Gln Ser Lys Arg Lys Val Ser Asp Val Lys
245 250 255

Val Pro Leu Leu Pro Ser His Lys Arg Met Asp Ile Thr Val Phe Lys
260 265 270

Pro Phe Ile Asp Leu Asp Thr Gln Pro Val Leu Phe Ile Pro Asp Val
275 280 285

His Phe Ala Asn Leu Gln Arg Gly Thr His Val Leu Pro Ile Ala Ser
290 295 300

Glu Glu Leu Glu Gly Glu Gly Ser Val Leu Lys Arg Gly Pro Tyr Gly
305 310 315 320

Thr Glu Asp Asp Phe Ala Val Pro Pro Ser Thr Lys Leu Ala Arg Ile
325 330 335

Glu Glu Pro Lys Arg Val Leu Leu Tyr Val Arg Lys Glu Ser Glu Glu
340 345 350

Val Phe Asp Ala Leu Met Leu Lys Thr Pro Ser Leu Lys Gly Leu Met
355 360 365

Glu Ala Ile Ser Asp Lys Tyr Asp Val Pro His Asp Lys Ile Gly Lys
370 375 380

Ile Phe Lys Lys Cys Lys Lys Gly Ile Leu Val Asn Met Asp Asp Asn
385 390 395 400

Ile Val Lys His Tyr Ser Asn Glu Asp Thr
405 410



<210> 3
<211> 2361
<212> DNA
<213> human

<220>
<221> CDS
<222> (7)
<223>

<400> 3

agcgcg atg aca cag gag tac gac aac aaa cgg cca gtg ttg gtt ctt 48
Met Thr Gln Glu Tyr Asp Asn Lys Arg Pro Val Leu Val Leu
1 5 10

cag aat gaa gca ctt tat cca cag cgg cgg tcc tac act agt gag gat 96
Gln Asn Glu Ala Leu Tyr Pro Gln Arg Arg Ser Tyr Thr Ser Glu Asp
15 20 25 30

gag gcc tgg aaa tcc ttc ctg gaa aac cct ctc act gca gcg acc aaa 144
Glu Ala Trp Lys Ser Phe Leu Glu Asn Pro Leu Thr Ala Ala Thr Lys
35 40 45

gcg atg atg agc atc aat gga gat gaa gac agc gcc gct gcg ctg ggc 192
Ala Met Met Ser Ile Asn Gly Asp Glu Asp Ser Ala Ala Ala Leu Gly
50 55 60

ctg ctc tat gac tac tac aag gtt cca aga gag aga agg tca tca aca 240
Leu Leu Tyr Asp Tyr Tyr Lys Val Pro Arg Glu Arg Arg Ser Ser Thr
65 70 75

gca aag cca gag gtg gag cac cct gag cca gat cac agc aaa aga aac 288
Ala Lys Pro Glu Val Glu His Pro Glu Pro Asp His Ser Lys Arg Asn
80 85 90

agc ata cca att gtg aca gag cag ccc ctc atc tct gct gga gaa aac 336
Ser Ile Pro Ile Val Thr Glu Gln Pro Leu Ile Ser Ala Gly Glu Asn
95 100 105 110

aga gtg caa gta ctg aaa aat gtg cca ttt aac att gtc ctt ccc cat 384
Arg Val Gln Val Leu Lys Asn Val Pro Phe Asn Ile Val Leu Pro His
115 120 125

ggc aac cag ctg ggc att gat aag aga ggc cat ctg aca gct tca gat 432
Gly Asn Gln Leu Gly Ile Asp Lys Arg Gly His Leu Thr Ala Ser Asp
130 135 140

acg aca gtc act gtc tcc ata gca acg atg cct acc cac tcc atc aag 480
Thr Thr Val Thr Val Ser Ile Ala Thr Met Pro Thr His Ser Ile Lys
145 150 155

aca gaa acc cag cca cat ggc ttc gct gtg gga atc ccc cca gca gtg 528
Thr Glu Thr Gln Pro His Gly Phe Ala Val Gly Ile Pro Pro Ala Val
160 165 170

tat cat cct gag ccc act gag cgg gtg gtg gtt ttc gat cgg aay ctc 576
Tyr His Pro Glu Pro Thr Glu Arg Val Val Val Phe Asp Arg Asn Leu
175 180 185 190

aat act gac cag ttc agc tct ggt gct caa gcc cca aat gct caa agg 624
Asn Thr Asp Gln Phe Ser Ser Gly Ala Gln Ala Pro Asn Ala Gln Arg
195 200 205

cga act cca gac tcg acc ttc tca gag acc ttc aag gaa ggc gtt cag 672
Arg Thr Pro Asp Ser Thr Phe Ser Glu Thr Phe Lys Glu Gly Val Gln
210 215 220

gag gtt ttc ttc ccc tcg gat ctc agt ctg cgg atg cct ggc atg aat 720
Glu Val Phe Phe Pro Ser Asp Leu Ser Leu Arg Met Pro Gly Met Asn
225 230 235

tca gag gac tat gtt ttt gac agt gtt tct ggg aac aac ttt gaa tat 768
Ser Glu Asp Tyr Val Phe Asp Ser Val Ser Gly Asn Asn Phe Glu Tyr
240 245 250

acc cta gaa gct tca aaa tca ctt cga cag aag cca gga gac agt acc 816
Thr Leu Glu Ala Ser Lys Ser Leu Arg Gln Lys Pro Gly Asp Ser Thr
255 260 265 270

atg acg tac ctg aac aaa ggc cag ttc tat ccc atc acc ttg aag gag 864
Met Thr Tyr Leu Asn Lys Gly Gln Phe Tyr Pro Ile Thr Leu Lys Glu
275 280 285

gtg agc agc agt gaa gga atc cat cat ccc atc agc aaa gtt cga agt 912
Val Ser Ser Ser Glu Gly Ile His His Pro Ile Ser Lys Val Arg Ser
290 295 300

gtg atc atg gtg gtt ttt gct gaa gac aaa agc aga gaa gat cag tta 960
Val Ile Met Val Val Phe Ala Glu Asp Lys Ser Arg Glu Asp Gln Leu
305 310 315

agg cat tgg aag tac tgg cac tcc cgg cag cac acc gct aaa caa aga 1008
Arg His Trp Lys Tyr Trp His Ser Arg Gln His Thr Ala Lys Gln Arg
320 325 330

tgc att gac ata gct gac tat awa gaa agc ttc aac act atc agt aac 1056
Cys Ile Asp Ile Ala Asp Tyr Xaa Glu Ser Phe Asn Thr Ile Ser Asn
335 340 345 350

atc gag gag att gcg tat aac gcc att tcc ttc aca tgg gac atc aac 1104
Ile Glu Glu Ile Ala Tyr Asn Ala Ile Ser Phe Thr Trp Asp Ile Asn
355 360 365

gat gaa gca aag gtt ttc atc tct gtg aac tgc tta agc aca gat ttc 1152
Asp Glu Ala Lys Val Phe Ile Ser Val Asn Cys Leu Ser Thr Asp Phe
370 375 380


tct tcc cag aag gga gtg aag ggg ttg cct ctt aac att caa gtt gat 1200
Ser Ser Gln Lys Gly Val Lys Gly Leu Pro Leu Asn Ile Gln Val Asp
385 390 395

acc tat agt tac aac aac cgc agc aac aag cct gtg cac cgg gcc tac 1248
Thr Tyr Ser Tyr Asn Asn Arg Ser Asn Lys Pro Val His Arg Ala Tyr
400 405 410

tgc cag atc aag gtc ttc tgt gac aag gga gct gag cgg aaa atc agg 1296
Cys Gln Ile Lys Val Phe Cys Asp Lys Gly Ala Glu Arg Lys Ile Arg
415 420 425 430

gat gaa gaa cga aag caa agc aaa aga aaa gtt tct gat gtt aaa gtg 1344
Asp Glu Glu Arg Lys Gln Ser Lys Arg Lys Val Ser Asp Val Lys Val
435 440 445

cca ctg ctt ccc tct cac aag cga atg gat atc aca gtt ttc aaa ccc 1392
Pro Leu Leu Pro Ser His Lys Arg Met Asp Ile Thr Val Phe Lys Pro
450 455 460

ttc att gat ctc gat act cag cct gtc ctc ttc att cct gac gtg cac 1440
Phe Ile Asp Leu Asp Thr Gln Pro Val Leu Phe Ile Pro Asp Val His
465 470 475

ttt gcc aac ttg cag cgg ggc act cat gtc ctt ccc att gcc tct gaa 1488
Phe Ala Asn Leu Gln Arg Gly Thr His Val Leu Pro Ile Ala Ser Glu
480 485 490

gaa ttg gag ggt gaa ggc tct gtc ttg aaa agg ggg ccg tac ggc aca 1536
Glu Leu Glu Gly Glu Gly Ser Val Leu Lys Arg Gly Pro Tyr Gly Thr
495 500 505 510

gaa gat gac ttt gct gtc cct cct tct acc aag ctg gcc cgg ata gaa 1584
Glu Asp Asp Phe Ala Val Pro Pro Ser Thr Lys Leu Ala Arg Ile Glu
515 520 525

gaa cca aag aga gtg ctg ctc tac gtt cga aag gag tca gaa gaa gtc 1632
Glu Pro Lys Arg Val Leu Leu Tyr Val Arg Lys Glu Ser Glu Glu Val
530 535 540

ttt gat gcc ctg atg ctc aaa acc cca tct ttg aag ggc ttg atg gaa 1680
Phe Asp Ala Leu Met Leu Lys Thr Pro Ser Leu Lys Gly Leu Met Glu
545 550 555

gct atc tca gac aaa tac gat gtt ccc cat gac aag att ggg aaa ata 1728
Ala Ile Ser Asp Lys Tyr Asp Val Pro His Asp Lys Ile Gly Lys Ile
560 565 570

ttc aag aag tgt aaa aag ggg atc ctg gtg aac atg gac gac aac att 1776
Phe Lys Lys Cys Lys Lys Gly Ile Leu Val Asn Met Asp Asp Asn Ile
575 580 585 590

gtg aag cat tac tcc aat gag gac acc ttc cag ctg cag att gaa gaa 1824
Val Lys His Tyr Ser Asn Glu Asp Thr Phe Gln Leu Gln Ile Glu Glu
595 600 605

gcc ggg ggg tct tac aag ctc acc ctg acg gag atc taaaggcctg 1870
Ala Gly Gly Ser Tyr Lys Leu Thr Leu Thr Glu Ile
610 615



cgggccacag ctccccagga gttcagtgca ggtgtttcta gatcttacgg tttggcaact 1930

gcaggtaacc ccagtcagcc atgtcgccag cacaggtcta tgtcgaggga atgggttcct 1990

tgcaggttgg aggcggggct gcatctggct tggtggtagc atttaatcta ttgcattggt 2050

gtttttcaga tgaaagagaa atccatatac cattatgttt gaatttcctg atatatacag 2110

gatttaaagt gaaaacttta ttccaagagt taacagagtc tctgggaagc tttaggacat 2170

ctgctacgtt atttatcaaa atattgggat ctctgccttg tgcctacagt gtcgtgggcc 2230

tgctcgctag cagaagtcag aaaaggcgat aggcttggct ttaaggattt cgtgcccttg 2290

cctgaattca gtacaactcc actgcctcac gttagcggga gcgcacctga agagtacggg 2350

gggagccctc t 2361



<210> 4
<211> 618
<212> PRT
<213> human

<220>
<221> miscj
<222> (342)
<223> The % :

<400> 4



Met Thr Gln Glu Tyr Asp Asn Lys Arg Pro Val Leu Val Leu Gln Asn
1 5 10 15

Glu Ala Leu Tyr Pro Gln Arg Arg Ser Tyr Thr Ser Glu Asp Glu Ala
20 25 30

Trp Lys Ser Phe Leu Glu Asn Pro Leu Thr Ala Ala Thr Lys Ala Met
35 40 45

Met Ser Ile Asn Gly Asp Glu Asp Ser Ala Ala Ala Leu Gly Leu Leu
50 55 60

Tyr Asp Tyr Tyr Lys Val Pro Arg Glu Arg Arg Ser Ser Thr Ala Lys
65 70 75 80

Pro Glu Val Glu His Pro Glu Pro Asp His Ser Lys Arg Asn Ser Ile
85 90 95

Pro Ile Val Thr Glu Gln Pro Leu Ile Ser Ala Gly Glu Asn Arg Val
100 105 110

Gln Val Leu Lys Asn Val Pro Phe Asn Ile Val Leu Pro His Gly Asn
115 120 125

Gln Leu Gly Ile Asp Lys Arg Gly His Leu Thr Ala Ser Asp Thr Thr
130 135 140

Val Thr Val Ser Ile Ala Thr Met Pro Thr His Ser Ile Lys Thr Glu
145 150 155 160

Thr Gln Pro His Gly Phe Ala Val Gly Ile Pro Pro Ala Val Tyr His
165 170 175

Pro Glu Pro Thr Glu Arg Val Val Val Phe Asp Arg Asn Leu Asn Thr
180 185 190

Asp Gln Phe Ser Ser Gly Ala Gln Ala Pro Asn Ala Gln Arg Arg Thr
195 200 205

Pro Asp Ser Thr Phe Ser Glu Thr Phe Lys Glu Gly Val Gln Glu Val
210 215 220

Phe Phe Pro Ser Asp Leu Ser Leu Arg Met Pro Gly Met Asn Ser Glu
225 230 235 240

Asp Tyr Val Phe Asp Ser Val Ser Gly Asn Asn Phe Glu Tyr Thr Leu
245 250 255

Glu Ala Ser Lys Ser Leu Arg Gln Lys Pro Gly Asp Ser Thr Met Thr
260 265 270

Tyr Leu Asn Lys Gly Gln Phe Tyr Pro Ile Thr Leu Lys Glu Val Ser
275 280 285

Ser Ser Glu Gly Ile His His Pro Ile Ser Lys Val Arg Ser Val Ile
290 295 300

Met Val Val Phe Ala Glu Asp Lys Ser Arg Glu Asp Gln Leu Arg His
305 310 315 320

Trp Lys Tyr Trp His Ser Arg Gln His Thr Ala Lys Gln Arg Cys Ile
325 330 335

Asp Ile Ala Asp Tyr Xaa Glu Ser Phe Asn Thr Ile Ser Asn Ile Glu
340 345 350

Glu Ile Ala Tyr Asn Ala Ile Ser Phe Thr Trp Asp Ile Asn Asp Glu
355 360 365

Ala Lys Val Phe Ile Ser Val Asn Cys Leu Ser Thr Asp Phe Ser Ser
370 375 380

Gln Lys Gly Val Lys Gly Leu Pro Leu Asn Ile Gln Val Asp Thr Tyr
385 390 395 400

Ser Tyr Asn Asn Arg Ser Asn Lys Pro Val His Arg Ala Tyr Cys Gln
405 410 415

Ile Lys Val Phe Cys Asp Lys Gly Ala Glu Arg Lys Ile Arg Asp Glu
420 425 430

Glu Arg Lys Gln Ser Lys Arg Lys Val Ser Asp Val Lys Val Pro Leu
435 440 445

Leu Pro Ser His Lys Arg Met Asp Ile Thr Val Phe Lys Pro Phe Ile
450 455 460

Asp Leu Asp Thr Gln Pro Val Leu Phe Ile Pro Asp Val His Phe Ala
465 470 475 480

Asn Leu Gln Arg Gly Thr His Val Leu Pro Ile Ala Ser Glu Glu Leu
485 490 495

Glu Gly Glu Gly Ser Val Leu Lys Arg Gly Pro Tyr Gly Thr Glu Asp
500 505 510

Asp Phe Ala Val Pro Pro Ser Thr Lys Leu Ala Arg Ile Glu Glu Pro
515 520 525

Lys Arg Val Leu Leu Tyr Val Arg Lys Glu Ser Glu Glu Val Phe Asp
530 535 540

Ala Leu Met Leu Lys Thr Pro Ser Leu Lys Gly Leu Met Glu Ala Ile
545 550 555 560

Ser Asp Lys Tyr Asp Val Pro His Asp Lys Ile Gly Lys Ile Phe Lys
565 570 575

Lys Cys Lys Lys Gly Ile Leu Val Asn Met Asp Asp Asn Ile Val Lys
580 585 590

His Tyr Ser Asn Glu Asp Thr Phe Gln Leu Gln Ile Glu Glu Ala Gly
595 600 605

Gly Ser Tyr Lys Leu Thr Leu Thr Glu Ile
610 615

<210> 5

<211> 4532 

<212> DNA 

<213> human 

<220> 

<221> CDS 

<222> (67) . . (1941) 

<223> 



<400> 5 

ttgaaagtcc agtttcacca gaggctgagg ctccaggaaa aggggagcaa gttcattgga 60 

tcaaac atg tca caa gag tca gac aat aat aaa aga cta gtg gcc tta 108
Met Ser Gln Glu Ser Asp Asn Asn Lys Arg Leu Val Ala Leu
1 5 10

gtg ccc atg ccc agt gac cct cca ttc aat acc cga aga gcc tac acc 156
Val Pro Met Pro Ser Asp Pro Pro Phe Asn Thr Arg Arg Ala Tyr Thr
15 20 25 30

agt gag gat gaa gcc tgg aag tca tac ttg gag aat ccc ctg aca gca 204
Ser Glu Asp Glu Ala Trp Lys Ser Tyr Leu Glu Asn Pro Leu Thr Ala
35 40 45

gcc acc aag gcc atg atg agc att aat ggt gat gag gac agt gct gct 252
Ala Thr Lys Ala Met Met Ser Ile Asn Gly Asp Glu Asp Ser Ala Ala
50 55 60

gcc ctc ggc ctg ctc tat gac tac tac aag gtt cct cga gac aag agg 300
Ala Leu Gly Leu Leu Tyr Asp Tyr Tyr Lys Val Pro Arg Asp Lys Arg
65 70 75

ctg ctg tct gta agc aaa gca agt gac agc caa gaa gac cag gag aaa 348
Leu Leu Ser Val Ser Lys Ala Ser Asp Ser Gln Glu Asp Gln Glu Lys
80 85 90

aga aac tgc ctt ggc acc agt gaa gcc cag agt aat ttg agt gga gga 396
Arg Asn Cys Leu Gly Thr Ser Glu Ala Gln Ser Asn Leu Ser Gly Gly
95 100 105 110

gaa aac cga gtg caa gtc cta aag act gtt cca gtg aac ctt tcc cta 444
Glu Asn Arg Val Gln Val Leu Lys Thr Val Pro Val Asn Leu Ser Leu
115 120 125

aat caa gat cac ctg gag aat tcc aag cgg gaa cag tac agc atc agc 492
Asn Gln Asp His Leu Glu Asn Ser Lys Arg Glu Gln Tyr Ser Ile Ser
130 135 140

ttc ccc gag agc tct gcc atc atc ccg gtg tcg gga atc acg gtg gtg 540
Phe Pro Glu Ser Ser Ala Ile Ile Pro Val Ser Gly Ile Thr Val Val
145 150 155

aaa gct gaa gat ttc aca cca gtt ttc atg gcc cca cct gtg cac tat 588
Lys Ala Glu Asp Phe Thr Pro Val Phe Met Ala Pro Pro Val His Tyr
160 165 170

ccc cgg gga gat ggg gaa gag caa cga gtg gtt atc ttt gaa cag act 636
Pro Arg Gly Asp Gly Glu Glu Gln Arg Val Val Ile Phe Glu Gln Thr
175 180 185 190

cag tat gac gtg ccc tcg ctg gcc acc cac agc gcc tat ctc aaa gac 684
Gln Tyr Asp Val Pro Ser Leu Ala Thr His Ser Ala Tyr Leu Lys Asp
195 200 205

gac cag cgc agc act ccg gac agc aca tac agc gag agc ttc aag gac 732
Asp Gln Arg Ser Thr Pro Asp Ser Thr Tyr Ser Glu Ser Phe Lys Asp
210 215 220

gca gcc aca gag aaa ttt cgg agt gct tca gtt ggg gct gag gag tac 780
Ala Ala Thr Glu Lys Phe Arg Ser Ala Ser Val Gly Ala Glu Glu Tyr
225 230 235

atg tat gat cag aca tca agt ggc aca ttt cag tac acc ctg gaa gcc 828
Met Tyr Asp Gln Thr Ser Ser Gly Thr Phe Gln Tyr Thr Leu Glu Ala
240 245 250

acc aaa tct ctc cgt cag aag cag ggg gag ggc ccc atg acc tac ctc 876
Thr Lys Ser Leu Arg Gln Lys Gln Gly Glu Gly Pro Met Thr Tyr Leu
255 260 265 270

aac aaa gga cag ttc tat gcc ata aca ctc agc gag acc gga gac aac 924
Asn Lys Gly Gln Phe Tyr Ala Ile Thr Leu Ser Glu Thr Gly Asp Asn
275 280 285

aaa tgc ttc cga cac ccc atc agc aaa gtc agg agt gtg gtg atg gtg 972
Lys Cys Phe Arg His Pro Ile Ser Lys Val Arg Ser Val Val Met Val
290 295 300

gtc ttc agt gaa gac aaa aac aga gat gaa cag ctc aaa tac tgg aaa 1020
Val Phe Ser Glu Asp Lys Asn Arg Asp Glu Gln Leu Lys Tyr Trp Lys
305 310 315

tac tgg cac tct cgg cag cat acg gcg aag cag agg gtc ctt gac att 1068
Tyr Trp His Ser Arg Gln His Thr Ala Lys Gln Arg Val Leu Asp Ile
320 325 330

gcc gat tac aag gag agc ttt aat acg att gga aac att gaa gag att 1116
Ala Asp Tyr Lys Glu Ser Phe Asn Thr Ile Gly Asn Ile Glu Glu Ile
335 340 345 350

gca tat aat gct gtt tcc ttt acc tgg gac gtg aat gaa gag gcg aag 1164
Ala Tyr Asn Ala Val Ser Phe Thr Trp Asp Val Asn Glu Glu Ala Lys
355 360 365

att ttc atc acc gtg aat tgc ttg agc aca gat ttc tcc tcc caa aaa 1212
Ile Phe Ile Thr Val Asn Cys Leu Ser Thr Asp Phe Ser Ser Gln Lys
370 375 380

ggg gtg aaa gga ctt cct ttg atg att cag att gac aca tac agt tat 1260
Gly Val Lys Gly Leu Pro Leu Met Ile Gln Ile Asp Thr Tyr Ser Tyr
385 390 395

aac aat cgt agc aat aaa ccc att cat aga gct tat tgc cag atc aag 1308
Asn Asn Arg Ser Asn Lys Pro Ile His Arg Ala Tyr Cys Gln Ile Lys
400 405 410

gtc ttc tgt gac aaa gga gca gaa aga aaa atc cga gat gaa gag cgg 1356
Val Phe Cys Asp Lys Gly Ala Glu Arg Lys Ile Arg Asp Glu Glu Arg
415 420 425 430

aag cag aac agg aag aaa ggg aaa ggc cag gcc tcc caa act caa tgc 1404
Lys Gln Asn Arg Lys Lys Gly Lys Gly Gln Ala Ser Gln Thr Gln Cys
435 440 445

aac agc tcc tct gat ggg aag ttg gct gcc ata cct tta cag aag aag 1452
Asn Ser Ser Ser Asp Gly Lys Leu Ala Ala Ile Pro Leu Gln Lys Lys
450 455 460

agt gac atc acc tac ttc aaa acc atg cct gat ctc cac tca cag cca 1500
Ser Asp Ile Thr Tyr Phe Lys Thr Met Pro Asp Leu His Ser Gln Pro
465 470 475

gtt ctc ttc ata cct gat gtt cac ttt gca aac ctg cag agg acc gga 1548
Val Leu Phe Ile Pro Asp Val His Phe Ala Asn Leu Gln Arg Thr Gly
480 485 490

cag gtg tat tac aac acg gat gat gaa cga gaa ggt ggc agt gtc ctt 1596
Gln Val Tyr Tyr Asn Thr Asp Asp Glu Arg Glu Gly Gly Ser Val Leu
495 500 505 510

gtt aaa cgg atg ttc cgg ccc atg gaa gag gag ttt ggt cca gtg cct 1644
Val Lys Arg Met Phe Arg Pro Met Glu Glu Glu Phe Gly Pro Val Pro
515 520 525

tca aag cag atg aaa gaa gaa ggg aca aag cga gtg ctc ttg tac gtg 1692
Ser Lys Gln Met Lys Glu Glu Gly Thr Lys Arg Val Leu Leu Tyr Val
530 535 540

agg aag gag act gac gat gtg ttc gat gca ttg atg ttg aag tct ccc 1740
Arg Lys Glu Thr Asp Asp Val Phe Asp Ala Leu Met Leu Lys Ser Pro
545 550 555

aca gtg aag ggc ctg atg gaa gcg ata tct gag aaa tat ggg ctg ccc 1788
Thr Val Lys Gly Leu Met Glu Ala Ile Ser Glu Lys Tyr Gly Leu Pro
560 565 570

gtg gag aag ata gca aag ctt tac aag aaa agc aaa aaa ggc atc ttg 1836
Val Glu Lys Ile Ala Lys Leu Tyr Lys Lys Ser Lys Lys Gly Ile Leu
575 580 585 590

gtg aac atg gat gac aac atc atc gag cac tac tcg aac gag gac acc 1884
Val Asn Met Asp Asp Asn Ile Ile Glu His Tyr Ser Asn Glu Asp Thr
595 600 605

ttc atc ctc aac atg gag agc atg gtg gag ggc ttc aag gtc acg ctc 1932
Phe Ile Leu Asn Met Glu Ser Met Val Glu Gly Phe Lys Val Thr Leu
610 615 620

atg gaa atc tagccctggg tttggcatcc gctttggctg gagctctcag 1981
Met Glu Ile
625

tgcgttcctc cctgagagag acagaagccc cagccccaga acctggagac ccatctcccc 2041

catctcacaa ctgctgttac aagaccgtgc tggggagtgg ggcaagggac aggccccact 2101 

gtcggtgtgc ttggcccatc cactggcacc taccacggag ctgaagcctg agcccctcag 2161

gaaggtgcct taggcctgtt ggattcctat ttattgccca ccttttcctg gagcccaggt 2221

ccaggcccgc caggactctg caggtcactg ctagctccag atgagaccgt ccagcgttcc 2281 

cccttcaaga gaaacactca tcccgaacag cctaaaaaat tcccatccct tctctctcac 2341 

ccctccatat ctatctcccg agtggctgga caaaatgagc tacgtctggg tgcagtagtt 2401 

ataggtgggg caagaggtgg atgcccactt tctggtcaga cacctttagg ttgctctggg 2461 

gaaggctgtc ttgctaaata cctccagggt tcccagcaag tggccaccag gccttgtaca 2521

ggaagacatt cagtcaccgt gtaattagta acacagaaag tctgcctgtc tgcattgtac 2581 

atagtgttta taatattgta ataatatatt ttacctgtgg tatgtgggca tgtttactgc 2641

cactggcctt agaggagaca cagacctgga gaccgtttta atgggggttt ttgcctctgt 2701

gcctgttcaa gagacttgca gggctaggta gagggccttt gggatgttaa ggtgactgca 2761

gctgatgcca agatggactc tgcaatgggc atacctgggg gctcgttccc tgtccccaga 2821

ggaagccccc tctccttctc catgggcatg actctccttc gaggccacca cgtttatctc 2881

acaatgatgt gttttgcttg actttccctt tgcgctgtct cgtgggaaag gtcattctgt 2941

ctgagacccc agctccttct ccagctttgg ctgcgggcat ggcctgagct ttctggagag 3001

cctctgcagg gggtttgcca tcagggccct gtggctgggt ctgctgcaga gctccttggc 3061

tatcaggaga atcctggaca ctgtactgtg cctcccagtt tacaaacacg cccttcatct 3121 

caagtggccc tttaaaaggc ctgctgccat gtgagagctg tgaacagctc agctctgagt 3181 

cggcaggctg gggcttcctc ctgggccacc agatggaaag ggggtattgt ttgcctcact 3241

cctggatgct gcgttttaag gaagtgagtg agaaagaatg tgccaagata cctggctcct 3301

gtgaaaccag cctcaggagg gaaactggga gagagaagct gtggtctcct gctacatgcc 3361 

ctgggagctg gaagagaaaa acactcccct aaacaatcgc aaaatgatga accatcatgg 3421

gccactgttc tctttgaggg gacaggttta ggggtttgcg ttcgcccttg tgggctgaag 3481 

cactagcttt ttggtagcta gacacatcct gcacccaaag gttctctaca aaggcccaga 3541

tttgtttgta aagcactttg actcttacct ggaggcccgc tctctaaggg cttcctgcgc 3601
irrracctca

V-* C4. w w t-A. 


fcc tcrtcccta 


agatgeagag 


c acr cr a t crcr aQ 


QQtctQCttc 


tagefceaget 


3661 


ahhtctcctt 


y ay y l. i-yuyy 


aggaattgaa 


ttgaatggga 


cagagggcag 


QtqctcrfccrQC 

3 ^-"3 3 3 3 


3721 


r*s» a era a era i~ 


n-uy ay v-. ay %-» 


agtgacgggg 


caccttgctg 


tgtgtcctct 


acrcrcatcrtta 


3781 




y y y y uv-aaa^j 


gtttgeateg 


tggatccagc 


tgtgctccag 


fcctgtcccct 


3841 


V»- U L» WU L.^<V-Cl^ 


1 - c t" cr a f t* err 1 c 


acgccccgga 


ccagcagctt 


ggggaccctc 


cacrcrqtacta 


3901 


a u.y y y y u l-u. i— 


y c u uy y ^ 


ggacaaattc 


agtgttggaa 


atacatgttg 


tactatgeae 


3961 


tin- ^auyu t_ 


uul »yy y l. l. a 


ggaatagttt 


caaacatgat 


tggcagacat 


aacaaeggea 


4021 


aciLa^ L-oyyct 


u. y y y y v— a c a 


ggactccaga gtaggaaaaa gacaaaagat 


ttggcagcct 


4081 


nana rrrrr^a 


acctacccch 

a w w a v» w v— ^ 


ctctctccag 


cctctttatg 


aaactgtttg 


tttgccagtc 


4141 


ctgccctaag gcagaagatg aattgaagat gctgtgcatg tttcctaagt ccttgagcaa 4201

tcatggtggt gacaattgcc acaagggata tgaggccagt gccaccagag ggtggtgcca 4261

agtgccacat cccttccgat ccattcccct ctgcatcctc ggagcacccc agtttgcctt 4321

tgatgtgtcc gctgtgtatg ttagctgaac tttgatgagc aaaatttcct gagcgaaaca 4381

ctccaaagag ataggaaaac ttgccgcctc ttcttttttg tcccttaatc aaactcaaat 4441

aagcttaaaa aaaatccatg gaagatcatg gacatgtgaa atgagcattt ttttcttttt 4501

tttttttttt tttaacaaag tctgaactga g 4532



<210> 6 

<211> 625 

<212> PRT 

<213> human 

<400> 6 

Met Ser Gln Glu Ser Asp Asn Asn Lys Arg Leu Val Ala Leu Val Pro
1 5 10 15

Met Pro Ser Asp Pro Pro Phe Asn Thr Arg Arg Ala Tyr Thr Ser Glu
20 25 30

Asp Glu Ala Trp Lys Ser Tyr Leu Glu Asn Pro Leu Thr Ala Ala Thr
35 40 45

Lys Ala Met Met Ser Ile Asn Gly Asp Glu Asp Ser Ala Ala Ala Leu
50 55 60

Gly Leu Leu Tyr Asp Tyr Tyr Lys Val Pro Arg Asp Lys Arg Leu Leu
65 70 75 80

Ser Val Ser Lys Ala Ser Asp Ser Gln Glu Asp Gln Glu Lys Arg Asn
85 90 95

Cys Leu Gly Thr Ser Glu Ala Gln Ser Asn Leu Ser Gly Gly Glu Asn
100 105 110

Arg Val Gln Val Leu Lys Thr Val Pro Val Asn Leu Ser Leu Asn Gln
115 120 125

Asp His Leu Glu Asn Ser Lys Arg Glu Gln Tyr Ser Ile Ser Phe Pro
130 135 140

Glu Ser Ser Ala Ile Ile Pro Val Ser Gly Ile Thr Val Val Lys Ala
145 150 155 160

Glu Asp Phe Thr Pro Val Phe Met Ala Pro Pro Val His Tyr Pro Arg
165 170 175

Gly Asp Gly Glu Glu Gln Arg Val Val Ile Phe Glu Gln Thr Gln Tyr
180 185 190

Asp Val Pro Ser Leu Ala Thr His Ser Ala Tyr Leu Lys Asp Asp Gln
195 200 205

Arg Ser Thr Pro Asp Ser Thr Tyr Ser Glu Ser Phe Lys Asp Ala Ala
210 215 220

Thr Glu Lys Phe Arg Ser Ala Ser Val Gly Ala Glu Glu Tyr Met Tyr
225 230 235 240

Asp Gln Thr Ser Ser Gly Thr Phe Gln Tyr Thr Leu Glu Ala Thr Lys
245 250 255

Ser Leu Arg Gln Lys Gln Gly Glu Gly Pro Met Thr Tyr Leu Asn Lys
260 265 270

Gly Gln Phe Tyr Ala Ile Thr Leu Ser Glu Thr Gly Asp Asn Lys Cys
275 280 285

Phe Arg His Pro Ile Ser Lys Val Arg Ser Val Val Met Val Val Phe
290 295 300

Ser Glu Asp Lys Asn Arg Asp Glu Gln Leu Lys Tyr Trp Lys Tyr Trp
305 310 315 320

His Ser Arg Gln His Thr Ala Lys Gln Arg Val Leu Asp Ile Ala Asp
325 330 335

Tyr Lys Glu Ser Phe Asn Thr Ile Gly Asn Ile Glu Glu Ile Ala Tyr
340 345 350

Asn Ala Val Ser Phe Thr Trp Asp Val Asn Glu Glu Ala Lys Ile Phe
355 360 365

Ile Thr Val Asn Cys Leu Ser Thr Asp Phe Ser Ser Gln Lys Gly Val
370 375 380

Lys Gly Leu Pro Leu Met Ile Gln Ile Asp Thr Tyr Ser Tyr Asn Asn
385 390 395 400

Arg Ser Asn Lys Pro Ile His Arg Ala Tyr Cys Gln Ile Lys Val Phe
405 410 415

Cys Asp Lys Gly Ala Glu Arg Lys Ile Arg Asp Glu Glu Arg Lys Gln
420 425 430

Asn Arg Lys Lys Gly Lys Gly Gln Ala Ser Gln Thr Gln Cys Asn Ser
435 440 445

Ser Ser Asp Gly Lys Leu Ala Ala Ile Pro Leu Gln Lys Lys Ser Asp
450 455 460

Ile Thr Tyr Phe Lys Thr Met Pro Asp Leu His Ser Gln Pro Val Leu
465 470 475 480

Phe Ile Pro Asp Val His Phe Ala Asn Leu Gln Arg Thr Gly Gln Val
485 490 495

Tyr Tyr Asn Thr Asp Asp Glu Arg Glu Gly Gly Ser Val Leu Val Lys
500 505 510

Arg Met Phe Arg Pro Met Glu Glu Glu Phe Gly Pro Val Pro Ser Lys
515 520 525

Gln Met Lys Glu Glu Gly Thr Lys Arg Val Leu Leu Tyr Val Arg Lys
530 535 540

Glu Thr Asp Asp Val Phe Asp Ala Leu Met Leu Lys Ser Pro Thr Val
545 550 555 560

Lys Gly Leu Met Glu Ala Ile Ser Glu Lys Tyr Gly Leu Pro Val Glu
565 570 575

Lys Ile Ala Lys Leu Tyr Lys Lys Ser Lys Lys Gly Ile Leu Val Asn
580 585 590

Met Asp Asp Asn Ile Ile Glu His Tyr Ser Asn Glu Asp Thr Phe Ile
595 600 605

Leu Asn Met Glu Ser Met Val Glu Gly Phe Lys Val Thr Leu Met Glu
610 615 620

Ile
625



<210> 7 

<211> 1870 

<212> DNA 

<213> HUMAN 

<220> 

<221> CDS 

<222> (47) . . (1867) 

<223> 



<400> 7 

aggagatgtg ccaaactgtt aagagtggtt atttctgagc agaaga atg tgg atg 55
Met Trp Met
1

aat tcc att ctt cct att ttt ctt ttc agg tct gtg cgg ctg cta aag 103
Asn Ser Ile Leu Pro Ile Phe Leu Phe Arg Ser Val Arg Leu Leu Lys
5 10 15

aac gac cca gtc aac ttg cag aaa ttc tct tac act agt gag gat gag 151
Asn Asp Pro Val Asn Leu Gln Lys Phe Ser Tyr Thr Ser Glu Asp Glu
20 25 30 35

gcc tgg aag acg tac cta gaa aac ccg ttg aca gct gcc aca aag gcc 199
Ala Trp Lys Thr Tyr Leu Glu Asn Pro Leu Thr Ala Ala Thr Lys Ala
40 45 50

atg atg aga gtc aat gga gat gat gac agt gtt gcg gcc ttg agc ttc 247
Met Met Arg Val Asn Gly Asp Asp Asp Ser Val Ala Ala Leu Ser Phe
55 60 65

ctc tat gat tac tac atg ggt ccc aag gag aag cgg ata ttg tcc tcc 295
Leu Tyr Asp Tyr Tyr Met Gly Pro Lys Glu Lys Arg Ile Leu Ser Ser
70 75 80

agc act ggg ggc agg aat gac caa gga aag agg tac tac cat ggc atg 343
Ser Thr Gly Gly Arg Asn Asp Gln Gly Lys Arg Tyr Tyr His Gly Met
85 90 95

gaa tat gag acg gac ctc act ccc ctt gaa agc ccc aca cac ctc atg 391
Glu Tyr Glu Thr Asp Leu Thr Pro Leu Glu Ser Pro Thr His Leu Met
100 105 110 115

aaa ytc ctg aca gag aac gtg tct gga acc cca gag tac cca gat ttg 439
Lys Xaa Leu Thr Glu Asn Val Ser Gly Thr Pro Glu Tyr Pro Asp Leu
120 125 130

ctc aag aag aat aac ctg atg agc ttg gag ggg gcc ttg ccc acc cct 487
Leu Lys Lys Asn Asn Leu Met Ser Leu Glu Gly Ala Leu Pro Thr Pro
135 140 145

ggc aag gca gct ccc ctc cct gca ggc ccc agc aag ctg gag gcc ggc 535
Gly Lys Ala Ala Pro Leu Pro Ala Gly Pro Ser Lys Leu Glu Ala Gly
150 155 160

tct gtg gac agc tac ctg tta ccc acy act gat atg tat gat aat ggc 583
Ser Val Asp Ser Tyr Leu Leu Pro Xaa Thr Asp Met Tyr Asp Asn Gly
165 170 175

tcc ctc aac tcc ttg ttt gag agc att cat ggg gtg ccg ccc aca cag 631
Ser Leu Asn Ser Leu Phe Glu Ser Ile His Gly Val Pro Pro Thr Gln
180 185 190 195

cgc tgg cag cca gac agc acc ttc aaa gat gac cca cag gag tcg atg 679
Arg Trp Gln Pro Asp Ser Thr Phe Lys Asp Asp Pro Gln Glu Ser Met
200 205 210

ctc ttc cca gat atc ctg aaa acc tcc ccg gaa ccc cca tgt cca gag 727
Leu Phe Pro Asp Ile Leu Lys Thr Ser Pro Glu Pro Pro Cys Pro Glu
215 220 225

gac tac ccc agc ctc aaa agt gac ttt gaa tac acc ctg ggc tcc ccc 775
Asp Tyr Pro Ser Leu Lys Ser Asp Phe Glu Tyr Thr Leu Gly Ser Pro
230 235 240

aaa gcc atc cac atc aag tca ggc gag tca ccc atg gcc tac ctc aac 823
Lys Ala Ile His Ile Lys Ser Gly Glu Ser Pro Met Ala Tyr Leu Asn
245 250 255

aaa ggc cag ttc tac ccc gtc acc ctg cgg acc cca gca ggt ggc aaa 871
Lys Gly Gln Phe Tyr Pro Val Thr Leu Arg Thr Pro Ala Gly Gly Lys
260 265 270 275

ggc ctt gcc ttg tcc tcc aac aaa gtc aag agt gtg gtg atg gtt gtc 919
Gly Leu Ala Leu Ser Ser Asn Lys Val Lys Ser Val Val Met Val Val
280 285 290

ttc gac aat gag aag gtc cca gta gag cag ctg cgc ttc tgg aag cac 967
Phe Asp Asn Glu Lys Val Pro Val Glu Gln Leu Arg Phe Trp Lys His
295 300 305

tgg cat tcc cgg caa ccc act gcc aag cag cgg gtc att gac gtg gct 1015
Trp His Ser Arg Gln Pro Thr Ala Lys Gln Arg Val Ile Asp Val Ala
310 315 320

gac tgc aaa gaa aac ttc aac act gtg gag cac att gag gag gtg gcc 1063
Asp Cys Lys Glu Asn Phe Asn Thr Val Glu His Ile Glu Glu Val Ala
325 330 335

tat aat gca ctg tcc ttt gtg tgg aac gtg aat gaa gag gcc aag gtg 1111
Tyr Asn Ala Leu Ser Phe Val Trp Asn Val Asn Glu Glu Ala Lys Val
340 345 350 355

ttc atc ggc gta aac tgt ctg agc aca gac ttt tcc tca caa aag ggg 1159
Phe Ile Gly Val Asn Cys Leu Ser Thr Asp Phe Ser Ser Gln Lys Gly
360 365 370

gtg aag ggt gtc ccc ctg aac ctg cag att gac acc tat gac tgt ggc 1207
Val Lys Gly Val Pro Leu Asn Leu Gln Ile Asp Thr Tyr Asp Cys Gly
375 380 385

ttg ggc act gag cgc ctg gta cac cgt gct gtc tgc cag atc aag atc 1255
Leu Gly Thr Glu Arg Leu Val His Arg Ala Val Cys Gln Ile Lys Ile
390 395 400

ttc tgt gac aag gga gct gag agg aag atg cgc gat gac gag cgg aag 1303
Phe Cys Asp Lys Gly Ala Glu Arg Lys Met Arg Asp Asp Glu Arg Lys
405 410 415

cag ttc cgg agg aag gtc aag tgc cct gac tcc agc aac agt ggc gtc 1351
Gln Phe Arg Arg Lys Val Lys Cys Pro Asp Ser Ser Asn Ser Gly Val
420 425 430 435

aa 9 99^ tgc cfc 9 ct 9 tcg gg c ttc agg ggc aat gag acg acc tac ctt 1399 
Lys Gly Cys Leu Leu Ser Gly Phe Arg Gly Asn Glu Thr Thr Tyr Leu 
440 445 450 

cgg cca gag act gac ctg gag acg cca ccc gtg ctg ttc atc ccc aat 1447
Arg Pro Glu Thr Asp Leu Glu Thr Pro Pro Val Leu Phe Ile Pro Asn
455 460 465

gtg cac ttc tcc agc ctg cag cgc tct gga ggg gca gcc ccc tcg gca 1495
Val His Phe Ser Ser Leu Gln Arg Ser Gly Gly Ala Ala Pro Ser Ala
470 475 480

gga ccc agc agc tcc aac agg ctg cct ctg aag cgt acc tgc tcg ccc 1543
Gly Pro Ser Ser Ser Asn Arg Leu Pro Leu Lys Arg Thr Cys Ser Pro
485 490 495

ttc act gag gag ttt gag cct ctg ccc tcc aag cag gcc aag gaa ggc 1591
Phe Thr Glu Glu Phe Glu Pro Leu Pro Ser Lys Gln Ala Lys Glu Gly
500 505 510 515

gac ctt cag aga gtt ctg ctg tat gtg cgg agg gag act gag gag gtg 1639
Asp Leu Gln Arg Val Leu Leu Tyr Val Arg Arg Glu Thr Glu Glu Val
520 525 530

ttt gac gcg ctc atg ttg aag acc cca gac ctg aag ggg ctg agg aat 1687
Phe Asp Ala Leu Met Leu Lys Thr Pro Asp Leu Lys Gly Leu Arg Asn
535 540 545

gcg atc tct gag aag tat ggg ttc cct gaa gag aac att tac aaa gtc 1735
Ala Ile Ser Glu Lys Tyr Gly Phe Pro Glu Glu Asn Ile Tyr Lys Val
550 555 560

tac aag aaa tgc aag cga gga atc tta gtc aac atg gac aac aac atc 1783
Tyr Lys Lys Cys Lys Arg Gly Ile Leu Val Asn Met Asp Asn Asn Ile
565 570 575

att cag cat tac agc aac cac gtc gcc ttc ctg ctg gac atg ggg gag 1831
Ile Gln His Tyr Ser Asn His Val Ala Phe Leu Leu Asp Met Gly Glu
580 585 590 595

ctg gac ggc aaa att cag atc atc ctt aag gag ctg taa 1870
Leu Asp Gly Lys Ile Gln Ile Ile Leu Lys Glu Leu
600 605

<210> 8 
<211> 607 
<212> PRT 
<213> HUMAN 

<220> 

<221> misc_feature
<222> (117) . . (117) 

<223> The 'Xaa' at location 117 stands for Leu, or Phe. 
<220> 

<221> misc_feature
<222> (172) . . (172) 

<223> The 'Xaa' at location 172 stands for Thr. 
<400> 8 

Met Trp Met Asn Ser Ile Leu Pro Ile Phe Leu Phe Arg Ser Val Arg
1 5 10 15

Leu Leu Lys Asn Asp Pro Val Asn Leu Gln Lys Phe Ser Tyr Thr Ser
20 25 30

Glu Asp Glu Ala Trp Lys Thr Tyr Leu Glu Asn Pro Leu Thr Ala Ala
35 40 45

Thr Lys Ala Met Met Arg Val Asn Gly Asp Asp Asp Ser Val Ala Ala
50 55 60

Leu Ser Phe Leu Tyr Asp Tyr Tyr Met Gly Pro Lys Glu Lys Arg Ile
65 70 75 80

Leu Ser Ser Ser Thr Gly Gly Arg Asn Asp Gln Gly Lys Arg Tyr Tyr
85 90 95

His Gly Met Glu Tyr Glu Thr Asp Leu Thr Pro Leu Glu Ser Pro Thr
100 105 110

His Leu Met Lys Xaa Leu Thr Glu Asn Val Ser Gly Thr Pro Glu Tyr
115 120 125

Pro Asp Leu Leu Lys Lys Asn Asn Leu Met Ser Leu Glu Gly Ala Leu
130 135 140

Pro Thr Pro Gly Lys Ala Ala Pro Leu Pro Ala Gly Pro Ser Lys Leu
145 150 155 160

Glu Ala Gly Ser Val Asp Ser Tyr Leu Leu Pro Xaa Thr Asp Met Tyr
165 170 175

Asp Asn Gly Ser Leu Asn Ser Leu Phe Glu Ser Ile His Gly Val Pro
180 185 190

Pro Thr Gln Arg Trp Gln Pro Asp Ser Thr Phe Lys Asp Asp Pro Gln
195 200 205

Glu Ser Met Leu Phe Pro Asp Ile Leu Lys Thr Ser Pro Glu Pro Pro
210 215 220

Cys Pro Glu Asp Tyr Pro Ser Leu Lys Ser Asp Phe Glu Tyr Thr Leu
225 230 235 240

Gly Ser Pro Lys Ala Ile His Ile Lys Ser Gly Glu Ser Pro Met Ala
245 250 255

Tyr Leu Asn Lys Gly Gln Phe Tyr Pro Val Thr Leu Arg Thr Pro Ala
260 265 270

Gly Gly Lys Gly Leu Ala Leu Ser Ser Asn Lys Val Lys Ser Val Val
275 280 285

Met Val Val Phe Asp Asn Glu Lys Val Pro Val Glu Gln Leu Arg Phe
290 295 300

Trp Lys His Trp His Ser Arg Gln Pro Thr Ala Lys Gln Arg Val Ile
305 310 315 320

Asp Val Ala Asp Cys Lys Glu Asn Phe Asn Thr Val Glu His Ile Glu
325 330 335

Glu Val Ala Tyr Asn Ala Leu Ser Phe Val Trp Asn Val Asn Glu Glu
340 345 350

Ala Lys Val Phe Ile Gly Val Asn Cys Leu Ser Thr Asp Phe Ser Ser
355 360 365

Gln Lys Gly Val Lys Gly Val Pro Leu Asn Leu Gln Ile Asp Thr Tyr
370 375 380

Asp Cys Gly Leu Gly Thr Glu Arg Leu Val His Arg Ala Val Cys Gln
385 390 395 400

Ile Lys Ile Phe Cys Asp Lys Gly Ala Glu Arg Lys Met Arg Asp Asp
405 410 415

Glu Arg Lys Gln Phe Arg Arg Lys Val Lys Cys Pro Asp Ser Ser Asn
420 425 430

Ser Gly Val Lys Gly Cys Leu Leu Ser Gly Phe Arg Gly Asn Glu Thr
435 440 445

Thr Tyr Leu Arg Pro Glu Thr Asp Leu Glu Thr Pro Pro Val Leu Phe
450 455 460

Ile Pro Asn Val His Phe Ser Ser Leu Gln Arg Ser Gly Gly Ala Ala
465 470 475 480

Pro Ser Ala Gly Pro Ser Ser Ser Asn Arg Leu Pro Leu Lys Arg Thr
485 490 495

Cys Ser Pro Phe Thr Glu Glu Phe Glu Pro Leu Pro Ser Lys Gln Ala
500 505 510

Lys Glu Gly Asp Leu Gln Arg Val Leu Leu Tyr Val Arg Arg Glu Thr
515 520 525

Glu Glu Val Phe Asp Ala Leu Met Leu Lys Thr Pro Asp Leu Lys Gly
530 535 540

Leu Arg Asn Ala Ile Ser Glu Lys Tyr Gly Phe Pro Glu Glu Asn Ile
545 550 555 560

Tyr Lys Val Tyr Lys Lys Cys Lys Arg Gly Ile Leu Val Asn Met Asp
565 570 575

Asn Asn Ile Ile Gln His Tyr Ser Asn His Val Ala Phe Leu Leu Asp
580 585 590

Met Gly Glu Leu Asp Gly Lys Ile Gln Ile Ile Leu Lys Glu Leu
595 600 605



<210> 9 

<211> 3113 

<212> DNA 

<213> MURINE 

<220> 

<221> misc_feature

<222> (2634) . . (2634) 

<223> n = any nucleotide 



<220> 

<221> misc_feature

<222> (2968) . . (2968) 

<223> n = any nucleotide 



<400> 9 
gttcctccat gggttccttg agttcctgac atggcttccc ttgatgatga actgtgtgac 60

ctaaacagca taccaaatgt gacggagcag cccctcattt ctgctggaga aaacagggta 120

caagtgctga aaaacgtgcc cttcaacatc gtcctccccc atagcaacca gctgggcatt 180

gataagagag gccatctgac agctcccgat acaacagtca ctgtctccat agcgaccatg 240

cctacccact ccatcaagac agaaatccag ccgcacggct ttgctgtggg aatccctcca 300

gccgtgtacc actctgagcc caccgaacgc gtggtggttt ttgaccggag cctcagcact 360

gatcagttca gctctggcac tcagcccccc aatgctcagc ggaggactcc agactccacc 420

ccttcaagga gggcgttcag gaggttttct tcccctcgga actcagcctt 480

cggatgccgg gcatgaattc agaggactat gtctttgaca atgtttctgg gaacaacttt 540

gagtataccc tggaagcctc caagtcactg cggcagaagc aaggggacag cactatgaca 600

tacctgaata aaggccagtt ctatcctgtc accttaaagg aaggaagcag caatgaaggg 660

attcaccacc ctatcagcaa agttcgaagt gtgatcatgg tggtttttgc tgaagacaaa 720

agcagagaag accagctgag acactggaag tactggcact cccgtcagca cacggccaaa 780

cagaggtgca ttgacattgc tgactacaaa gaaagtttca acactatcag caacattgag 840

gagatagctt ataacgccat ttccttcacg tgggacatca atgatgagac aaaggtcttc 900

atctctgtga actgcttgag cacagatttc tcttctcaga agggtgtgaa gggcttgcca 960

ctcaacattc aaatcgacac atacagctat aacaaccgca gcaacaagcc ggttcaccgg 1020

gcctactgcc agataaaggt cttctgcgac aagggagctg aaaggaaaat tcgggatgaa 1080

gaacgaaaac agagcaagag aaaagtgtct gacgttaaag tgcagctgct tccctcacac 1140

aaacggacag acatcacagt gttcaagccc ttcctggacc tcgacactca gcctgtcctc 1200

ttcattccgg acgtgcattt taccaacctg cagcggggca gtcatgttct ttccctcccc 1260

tctgaagaac tggaaggtga aggctctgtc ttgaaaagag ggccattcgg aaccgaagat 1320

gactttggag ttcctcctcc tgctaagctg actcggacag aagaacccaa gagagtgctg 1380

ctctatgtcc gaaaggaatc agaagaagtc ttcgacgccc tgatgctcaa gacgccgtct 1440

ttgaagggcc tgatagaggc aatttcagac aagtatgatg tcccccatga caagattggg 1500

aaaatattta agaagtgcaa aaaagggatc ctcgtgaaca tggacgacaa cattgtgaag 1560

cactactcca atgaggacac cttccagctg cagatagagg aagccggcgg ctcgtacaag 1620

ctcaccctga cagagattta aaggggcagg ggtggggggc gctcggctcc caggcgtggg 1680

aattcagtga aagtgttcca gctgagaagc ccaggcacct accctgcaga accttaaata 1740

tcagggaagg aacctttcac gtaggaaatg gcgctgtgta taccgtgctg tgttgatgtt 1800

ttcttttgga tagaaatcca tgtgttgttt tgttgttgtt gtttgaattt ctgatgtgct 1860

tagaaagcga agcatgagaa ctttgtaccg gatctaagag accatgggac cgtttgggtt 1920

acctgctcca ctacctgtca aagtctgcct gtgtccataa gagtggtggg ctactggctg 1980

gcgagagagg ggaaggcagt agcttgtctt tgaggctttt gtgttctcgc ctgacctcag 2040

tctaactctg actgccttga ggagtgggcc cagccctcag caataaaggg ctaagccttc 2100

tccctccacc tctcctccag tgtttactaa atagggtgca ttcctggaac cttttcccgc 2160

aacttccctt ggacatgtgg actgcctttc tgatgaagaa cttgcgtgag tgacagtgtg 2220

aagttagctc tgttaaagct gcgttgtata taagtgcaat atctttttga aggtctgcct 2280

gtaaatgtgt acatatatgt ctgatataaa tatataatat ataaatgcgg tgtctgtgta 2340

cagatagtga aggcgagcag gaagatctac cttgaaatcc ctcttagaga agaggttaag 2400

ttattattga taatgtggac caagcaggta gaacgctgtt ttcccaaaaa caagcaagtg 2460

ttccctagca tagcaaaaag ccatctcatg tggcagagcc atctgctctt gcgaatgttg 2520

tcaccgtgtg ggtttctgca ccctgagtgg agctaatgga agactggact gcagctacta 2580

tatgaggtgt gtgtgcaggt gtcagccaag ctgtgcccat gcagagactc agcngtgtca 2640

tgagccagcg attcaaacca aaatgggccg attctacaag gccatgtttc agagcttcca 2700

agcatcagct accgtgtgtt tgaactggaa ggcattcatg aatttacata actgtggcag 2760

gggaatgttt tgtgcacact taaatattta agaacaaaac gaaactttac aatgtaaytt 2820

tataatgaat cctgtaacag aaatacaatt gcgggtttct ttaggttcag ggaactagaa 2880

taggtcattt gtatgagtag gattgttagc ggtatacgta rgttaaaaag tactctaatg 2940

aagtatgtga acaaaatagc tggttttnta agatacggga tacgggtcat ataacaatat 3000

tttctatttt gttttatgaa atcagcttta cttgttttaa ttgtatcatt gaacatgtgt 3060

tttaaaccaa agggattgaa ttttatatgt ctatttcaaa aaaaaaaaaa aaa 3113



<210> 10 

<211> 536 

<212> PRT 

<213> MURINE 

<400> 10 

Met Ala Ser Leu Asp Asp Glu Leu Cys Asp Leu Asn Ser Ile Pro Asn
1 5 10 15

Val Thr Glu Gln Pro Leu Ile Ser Ala Gly Glu Asn Arg Val Gln Val
20 25 30

Leu Lys Asn Val Pro Phe Asn Ile Val Leu Pro His Ser Asn Gln Leu
35 40 45

Gly Ile Asp Lys Arg Gly His Leu Thr Ala Pro Asp Thr Thr Val Thr
50 55 60

Val Ser Ile Ala Thr Met Pro Thr His Ser Ile Lys Thr Glu Ile Gln
65 70 75 80

Pro His Gly Phe Ala Val Gly Ile Pro Pro Ala Val Tyr His Ser Glu
85 90 95

Pro Thr Glu Arg Val Val Val Phe Asp Arg Ser Leu Ser Thr Asp Gln
100 105 110

Phe Ser Ser Gly Thr Gln Pro Pro Asn Ala Gln Arg Arg Thr Pro Asp
115 120 125

Ser Thr Phe Ser Glu Thr Phe Lys Glu Gly Val Gln Glu Val Phe Phe
130 135 140

Pro Ser Glu Leu Ser Leu Arg Met Pro Gly Met Asn Ser Glu Asp Tyr
145 150 155 160

Val Phe Asp Asn Val Ser Gly Asn Asn Phe Glu Tyr Thr Leu Glu Ala
165 170 175

Ser Lys Ser Leu Arg Gln Lys Gln Gly Asp Ser Thr Met Thr Tyr Leu
180 185 190

Asn Lys Gly Gln Phe Tyr Pro Val Thr Leu Lys Glu Gly Ser Ser Asn
195 200 205

Glu Gly Ile His His Pro Ile Ser Lys Val Arg Ser Val Ile Met Val
210 215 220

Val Phe Ala Glu Asp Lys Ser Arg Glu Asp Gln Leu Arg His Trp Lys
225 230 235 240

Tyr Trp His Ser Arg Gln His Thr Ala Lys Gln Arg Cys Ile Asp Ile
245 250 255

Ala Asp Tyr Lys Glu Ser Phe Asn Thr Ile Ser Asn Ile Glu Glu Ile
260 265 270

Ala Tyr Asn Ala Ile Ser Phe Thr Trp Asp Ile Asn Asp Glu Ala Lys
275 280 285

Val Phe Ile Ser Val Asn Cys Leu Ser Thr Asp Phe Ser Ser Gln Lys
290 295 300

Gly Val Lys Gly Leu Pro Leu Asn Ile Gln Ile Asp Thr Tyr Ser Tyr
305 310 315 320

Asn Asn Arg Ser Asn Lys Pro Val His Arg Ala Tyr Cys Gln Ile Lys
325 330 335

Val Phe Cys Asp Lys Gly Ala Glu Arg Lys Ile Arg Asp Glu Glu Arg
340 345 350

Lys Gln Ser Lys Arg Lys Val Ser Asp Val Lys Val Gln Leu Leu Pro
355 360 365

Ser His Lys Arg Thr Asp Ile Thr Val Phe Lys Pro Phe Leu Asp Leu
370 375 380

Asp Thr Gln Pro Val Leu Phe Ile Pro Asp Val His Phe Thr Asn Leu
385 390 395 400

Gln Arg Gly Ser His Val Leu Ser Leu Pro Ser Glu Glu Leu Glu Gly
405 410 415

Glu Gly Ser Val Leu Lys Arg Gly Pro Phe Gly Thr Glu Asp Asp Phe
420 425 430

Gly Val Pro Pro Pro Ala Lys Leu Thr Arg Thr Glu Glu Pro Lys Arg
435 440 445

Val Leu Leu Tyr Val Arg Lys Glu Ser Glu Glu Val Phe Asp Ala Leu
450 455 460

Met Leu Lys Thr Pro Ser Leu Lys Gly Leu Met Glu Ala Ile Ser Asp
465 470 475 480

Lys Tyr Asp Val Pro His Asp Lys Ile Gly Lys Ile Phe Lys Lys Cys
485 490 495

Lys Lys Gly Ile Leu Val Asn Met Asp Asp Asn Ile Val Lys His Tyr
500 505 510

Ser Asn Glu Asp Thr Phe Gln Leu Gln Ile Glu Glu Ala Gly Gly Ser
515 520 525

Tyr Lys Leu Thr Leu Thr Glu Ile
530 535



<210> 11 

<211> 3452 

<212> DNA 

<213> MURINE 

<220> 

<221> misc_feature

<222> (2973) . . (2973) 

<223> n = any nucleotide 



<220> 

<221> misc_feature

<222> (3307) . . (3307) 

<223> n = any nucleotide 



<400> 11

cgccgctccg gacccaccgc ctgccgccgc gcgccgcccg ccgccgcctc ctccccccgg 60

atcgggtgta ctgtcccaac ccgaaagtcc agttctgcgg cccggcagcg gcgagcgagc 120 

gcgatgacac aggagtacga caacaaaagg cccgtgctgg tacttcagaa tgaagccctc 180 

tacccacagc ggcgctccta taccagtgag gatgaagcct ggaagtcgtt cctggaaaac 240 

cctctcactg cggcaaccaa agcgatgatg agcatcaacg gagacgaaga cagcgcggct 300

gcgctgggcc tgctctatga ctactacaag gtccccagag agcgccggtc atcagccgta 360

aagcccgagg gagagcaccc agagccagag cacagcaaaa gaaacagcat accaaatgtg 420 

acggagcagc ccctcatttc tgctggagaa aacagggtac aagtgctgaa aaacgtgccc 480

ttcaacatcg tcctccccca tagcaaccag ctgggcattg ataagagagg ccatctgaca 540 

gctcccgata caacagtcac tgtctccata gcgaccatgc ctacccactc catcaagaca 600 

gaaatccagc cgcacggctt tgctgtggga atccctccag ccgtgtacca ctctgagccc 660 

accgaacgcg tggtggtttt tgaccggagc ctcagcactg atcagttcag ctctggcact 720 

cagcccccca atgctcagcg gaggactcca gactccacct tctccgagac cttcaaggag 780 

ggcgttcagg aggttttctt cccctcggaa ctcagccttc ggatgccggg catgaattca 840 

gaggactatg tctttgacaa tgtttctggg aacaactttg agtataccct ggaagcctcc 900 

aagtcactgc ggcagaagca aggggacagc actatgacat acctgaataa aggccagttc 960 
tatcctgtca ccttaaagga aggaagcagc aatgaaggga ttcaccaccc tatcagcaaa 1020

gttcgaagtg tgatcatggt ggtttttgct gaagacaaaa gcagagaaga ccagctgaga 1080 

cactggaagt actggcactc ccgtcagcac acggccaaac agaggtgcat tgacattgct 1140 

gactacaaag aaagtttcaa cactatcagc aacattgagg agatagctta taacgccatt 1200

tccttcacgt gggacatcaa tgatgaggca aaggtcttca tctctgtgaa ctgcttgagc 1260 

acagatttct cttctcagaa gggtgtgaag ggcttgccac tcaacattca aatcgacaca 1320 

tacagctata acaaccgcag caacaagccg gttcaccggg cctactgcca gataaaggtc 1380 

ttctgcgaca agggagctga aaggaaaatt cgggatgaag aacgaaaaca gagcaagaga 1440 

aaagtgtctg acgttaaagt gcagctgctt ccctcacaca aacggacaga catcacagtg 1500 

ttcaagccct tcctggacct cgacactcag cctgtcctct tcattccgga cgtgcatttt 1560 

accaacctgc agcggggcag tcatgttctt tccctcccct ctgaagaact ggaaggtgaa 1620 

ggctctgtct tgaaaagagg gccattcgga accgaagatg actttggagt tcctcctcct 1680 

gctaagctga ctcggacaga agaacccaag agagtgctgc tctatgtccg aaaggaatca 1740

gaagaagtct tcgacgccct gatgctcaag acgccgtctt tgaagggcct gatggaggca 1800 

atttcagaca agtatgatgt cccccatgac aagattggga aaatatttaa gaagtgcaaa 1860 

aaagggatcc tcgtgaacat ggacgacaac attgtgaagc actactccaa tgaggacacc 1920

ttccagctgc agatagagga agccggcggc tcgtacaagc tcaccctgac agagatttaa 1980 

aggggcaggg gtggggggcg ctcggctccc aggcgtggga attcagtgaa agtgttccag 2040 

ctgagaagcc caggcaccta ccctgcagaa ccttaaatat cagggaagga acctttcacg 2100 

taggaaatgg cgctgtgtat accgtgctgt gttgatgttt tcttttggat agaaatccat 2160 

gtgttgtttt gttgttgttg tttgaatttc tgatgtgctt agaaagcgaa gcatgagaac 2220 

tttgtaccgg atctaagaga ccatgggacc gtttgggtta cctgctccac tacctgtcaa 2280 

agtctgcctg tgtccataag agtggtgggc tactggctgg cgagagaggg gaaggcagta 2340 

gcttgtcttt gaggcttttg tgttctcgcc tgacctcagt ctaactctga ctgccttgag 2400 

gagtgggccc agccctcagc aataaagggc taagccttct ccctccacct ctcctccagt 2460 

gtttactaaa tagggtgcat tcctggaacc ttttcccgca acttcccttg gacatgtgga 2520 

ctgcctttct gatgaagaac ttgcgtgagt gacagtgtga agttagctct gttaaagctg 2580 

cgttgtatat aagtgcaata tctttttgaa ggtctgcctg taaatgtgta catatatgtc 2640 

tgatataaat atataatata taaatgcggt gtctgtgtac agatagtgaa ggcgagcagg 2700 
aagatctacc ttgaaatccc tcttagagaa gaggttaagt tattattgat aatgtggacc 2760

aagcaggtag aacgctgttt tcccaaaaac aagcaagtgt tccctagcat agcaaaaagc 2820

catctcatgt ggcagagcca tctgctcttg cgaatgttgt caccgtgtgg gtttctgcac 2880

cctgagtgga gctaatggaa gactggactg cagctactat atgaggtgtg tgtgcaggtg 2940

tcagccaagc tgtgcccatg cagagactca gcngtgtcat gagccagcga ttcaaaccaa 3000

aatgggccga ttctacaagg ccatgtttca gagcttccaa gcatcagcta ccgtgtgttt 3060

gaactggaag gcattcatga atttacataa ctgtggcagg ggaatgtttt gtgcacactt 3120

aaatatttaa gaacaaaacg aaactttaca atgtaayttt ataatgaatc ctgtaacaga 3180

aatacaattg cgggtttctt taggttcagg gaactagaat aggtcatttg tatgagtagg 3240

attgttagcg gtatacgtar gttaaaaagt actctaatga agtatgtgaa caaaatagct 3300

ggttttntaa gatacgggat acgggtcata taacaatatt ttctattttg ttttatgaaa 3360

tcagctttac ttgttttaat tgtatcattg aacatgtgtt ttaaaccaaa gggattgaat 3420

tttatatgtc tatttcaaaa aaaaaaaaaa aa 3452



<210> 12 

<211> 618 

<212> PRT

<213> MURINE 

<400> 12 

Met Thr Gln Glu Tyr Asp Asn Lys Arg Pro Val Leu Val Leu Gln Asn
1 5 10 15

Glu Ala Leu Tyr Pro Gln Arg Arg Ser Tyr Thr Ser Glu Asp Glu Ala
20 25 30

Trp Lys Ser Phe Leu Glu Asn Pro Leu Thr Ala Ala Thr Lys Ala Met
35 40 45

Met Ser Ile Asn Gly Asp Glu Asp Ser Ala Ala Ala Leu Gly Leu Leu
50 55 60

Tyr Asp Tyr Tyr Lys Val Pro Arg Glu Arg Arg Ser Ser Ala Val Lys
65 70 75 80

Pro Glu Gly Glu His Pro Glu Pro Glu His Ser Lys Arg Asn Ser Ile
85 90 95

Pro Asn Val Thr Glu Gln Pro Leu Ile Ser Ala Gly Glu Asn Arg Val
100 105 110

Gln Val Leu Lys Asn Val Pro Phe Asn Ile Val Leu Pro His Ser Asn
115 120 125

Gln Leu Gly Ile Asp Lys Arg Gly His Leu Thr Ala Pro Asp Thr Thr
130 135 140

Val Thr Val Ser Ile Ala Thr Met Pro Thr His Ser Ile Lys Thr Glu
145 150 155 160

Ile Gln Pro His Gly Phe Ala Val Gly Ile Pro Pro Ala Val Tyr His
165 170 175

Ser Glu Pro Thr Glu Arg Val Val Val Phe Asp Arg Ser Leu Ser Thr
180 185 190

Asp Gln Phe Ser Ser Gly Thr Gln Pro Pro Asn Ala Gln Arg Arg Thr
195 200 205

Pro Asp Ser Thr Phe Ser Glu Thr Phe Lys Glu Gly Val Gln Glu Val
210 215 220

Phe Phe Pro Ser Glu Leu Ser Leu Arg Met Pro Gly Met Asn Ser Glu
225 230 235 240

Asp Tyr Val Phe Asp Asn Val Ser Gly Asn Asn Phe Glu Tyr Thr Leu
245 250 255

Glu Ala Ser Lys Ser Leu Arg Gln Lys Gln Gly Asp Ser Thr Met Thr
260 265 270

Tyr Leu Asn Lys Gly Gln Phe Tyr Pro Val Thr Leu Lys Glu Gly Ser
275 280 285

Ser Asn Glu Gly Ile His His Pro Ile Ser Lys Val Arg Ser Val Ile
290 295 300

Met Val Val Phe Ala Glu Asp Lys Ser Arg Glu Asp Gln Leu Arg His
305 310 315 320

Trp Lys Tyr Trp His Ser Arg Gln His Thr Ala Lys Gln Arg Cys Ile
325 330 335

Asp Ile Ala Asp Tyr Lys Glu Ser Phe Asn Thr Ile Ser Asn Ile Glu
340 345 350

Glu Ile Ala Tyr Asn Ala Ile Ser Phe Thr Trp Asp Ile Asn Asp Glu
355 360 365

Ala Lys Val Phe Ile Ser Val Asn Cys Leu Ser Thr Asp Phe Ser Ser
370 375 380

Gln Lys Gly Val Lys Gly Leu Pro Leu Asn Ile Gln Ile Asp Thr Tyr
385 390 395 400

Ser Tyr Asn Asn Arg Ser Asn Lys Pro Val His Arg Ala Tyr Cys Gln
405 410 415

Ile Lys Val Phe Cys Asp Lys Gly Ala Glu Arg Lys Ile Arg Asp Glu
420 425 430

Glu Arg Lys Gln Ser Lys Arg Lys Val Ser Asp Val Lys Val Gln Leu
435 440 445

Leu Pro Ser His Lys Arg Thr Asp Ile Thr Val Phe Lys Pro Phe Leu
450 455 460

Asp Leu Asp Thr Gln Pro Val Leu Phe Ile Pro Asp Val His Phe Thr
465 470 475 480

Asn Leu Gln Arg Gly Ser His Val Leu Ser Leu Pro Ser Glu Glu Leu
485 490 495

Glu Gly Glu Gly Ser Val Leu Lys Arg Gly Pro Phe Gly Thr Glu Asp
500 505 510

Asp Phe Gly Val Pro Pro Pro Ala Lys Leu Thr Arg Thr Glu Glu Pro
515 520 525

Lys Arg Val Leu Leu Tyr Val Arg Lys Glu Ser Glu Glu Val Phe Asp
530 535 540

Ala Leu Met Leu Lys Thr Pro Ser Leu Lys Gly Leu Met Glu Ala Ile
545 550 555 560

Ser Asp Lys Tyr Asp Val Pro His Asp Lys Ile Gly Lys Ile Phe Lys
565 570 575

Lys Cys Lys Lys Gly Ile Leu Val Asn Met Asp Asp Asn Ile Val Lys
580 585 590

His Tyr Ser Asn Glu Asp Thr Phe Gln Leu Gln Ile Glu Glu Ala Gly
595 600 605

Gly Ser Tyr Lys Leu Thr Leu Thr Glu Ile
610 615

<210> 13 

<211> 2195 

<212> DNA 

<213> murine 

<400> 13 

cgcccgggca ggtcagactt gaaagtccag tttcaccaga ggctgaggct ccaggaaaag 60 

gggagcgagt tcattggatc aaacatgtca caagagtcgg acaataataa aagactagtg 120 

gccttagtgc ccatgcccag tgaccctccc ttcaacaccc gaagagccta cacaagtgag 180 

gatgaggcct ggaagtcata tctggagaac cccctgactg cggccaccaa ggcgatgatg 240 

agcatcaacg gggacgagga cagtgctgcc gccctgggcc tgctctatga ctactacaag 300 

gttcctcgag acaagagact tctgtctgtg agcaaagcaa gtgacagcca agaagaccag 360 

gataaaagaa actgccttgg caccagtgaa gcccagatca atttgagcgg aggcgagaac 420 

agagtgcagg ttctgaagac tgtcccggtg aacctctgtc taagtcaaga ccacatggag 480 

aattcgaagc gcgagcagta cagtgtatcc atcaccgaga gctctgccgt catccccgtg 540 

tcaggcatca ccgtggtgaa agccgaggat ttcacaccgg tgttcatggc gcccccggtg 600 

cactatcccc gcgcggacag tgaggagcag cgcgtggtta tctttgaaca gactcagtac 660 

gacctgccct ccatagccag ccacagctcc tatctcaagg acgaccagcg cagcacgccg 720

gacagcacct acagcgagag ctttaaggac ggcgcctcgg agaaatttcg gagtacttct 780

gttggtgctg acgagtatac atatgaccag acgggaagtg gtacatttca gtacaccctg 840 

gaagccacca aatctctccg tcagaaacag ggggagggcc ccatgaccta cctcaacaaa 900 

ggacaattct atgccataac actcagtgag actggagaca acaaatgctt ccgacacccc 960 
atcagcaaag tcaggagtgt ggtgatggtg gtctttagtg aagacaaaaa ccgagatgag 1020

cagctgaaat actggaagta ctggcactcc cggcagcaca ctgccaagca gagggtcctt 1080

gacattgctg attacaagga gagcttcaac accatcggga acattgaaga gatcgcatac 1140

aatgctgttt ccttcacctg ggatgtgaac gaggaggcaa agatttttat caccgtgaat 1200

tgcctgagta cagatttctc ctcccaaaag ggtgtaaaag gacttcccct gatgattcag 1260

atcgacacgt acagctacaa caaccgcagc aataaaccca tccacagagc atactgccag 1320

atcaaggtct tctgtgacaa gggagcagaa agaaaaatcc gggatgaaga gagaaagcag 1380

aacaggaaga aagggaaggg ccaggcctct caagcccagt gcaacaactc ctctgatggg 1440

aagatggccg ccataccgtt acagaagaag agtgacatca cgtacttcaa aaccatgccc 1500

gacctgcact cacagcctgt gctcttcata ccagatgttc actttgcaaa cctacagagg 1560

accggacagg tttattacaa cacagacgat gagcgagaag gcagcagcgt ccttgttaag 1620

cggatgttca ggcccatgga agaggagttt ggtccaacac cgtctaagca gatcaaagaa 1680

gaaaacgtaa aacgagtgct tttatatgtg aggaaggaga acgatgacgt cttcgatgct 1740

ctgatgctga aatcacccac ggtgaagggt ctgatggaag cgctgtctga gaagtatggg 1800

cfccrccaatcra aaaaaatcac aaagctttat aagaagagca aaaagggcat a a t act t r*a a c* 1860

atggatgaca acatcattga gcactattca aatgaggaca ccttcatcct caacatggag 1920

agcatggtgg aaggcttcaa gatcacgctg atggagatct gagccctggg tgtcccctcg 1980

ataggagctt ttggtatact ccttcctggg agagatggga tctctgccgc cccaggacct 2040

ggagacccac ccatctcact cacctctcaa gactgttaca agactgctgg gaaggggggc 2100

agggcccaag gcccagtaat ggacttcctt caactcttcc acttgctccc tatggagctg 2160

aagcctgagc ccctcagcaa atttcttctc gtgcc 2195



<210> 14 

<211> 625 

<212> PRT 

<213> murine 

<400> 14 

Met Ser Gln Glu Ser Asp Asn Asn Lys Arg Leu Val Ala Leu Val Pro
1 5 10 15

Met Pro Ser Asp Pro Pro Phe Asn Thr Arg Arg Ala Tyr Thr Ser Glu
20 25 30

Asp Glu Ala Trp Lys Ser Tyr Leu Glu Asn Pro Leu Thr Ala Ala Thr
35 40 45

Lys Ala Met Met Ser Ile Asn Gly Asp Glu Asp Ser Ala Ala Ala Leu
50 55 60

Gly Leu Leu Tyr Asp Tyr Tyr Lys Val Pro Arg Asp Lys Arg Leu Leu
65 70 75 80

Ser Val Ser Lys Ala Ser Asp Ser Gln Glu Asp Gln Asp Lys Arg Asn
85 90 95

Cys Leu Gly Thr Ser Glu Ala Gln Ile Asn Leu Ser Gly Gly Glu Asn
100 105 110

Arg Val Gln Val Leu Lys Thr Val Pro Val Asn Leu Cys Leu Ser Gln
115 120 125

Asp His Met Glu Asn Ser Lys Arg Glu Gln Tyr Ser Val Ser Ile Thr
130 135 140

Glu Ser Ser Ala Val Ile Pro Val Ser Gly Ile Thr Val Val Lys Ala
145 150 155 160

Glu Asp Phe Thr Pro Val Phe Met Ala Pro Pro Val His Tyr Pro Arg
165 170 175

Ala Asp Ser Glu Glu Gln Arg Val Val Ile Phe Glu Gln Thr Gln Tyr
180 185 190

Asp Leu Pro Ser Ile Ala Ser His Ser Ser Tyr Leu Lys Asp Asp Gln
195 200 205

Arg Ser Thr Pro Asp Ser Thr Tyr Ser Glu Ser Phe Lys Asp Gly Ala
210 215 220

Ser Glu Lys Phe Arg Ser Thr Ser Val Gly Ala Asp Glu Tyr Thr Tyr
225 230 235 240

Asp Gln Thr Gly Ser Gly Thr Phe Gln Tyr Thr Leu Glu Ala Thr Lys
245 250 255

Ser Leu Arg Gln Lys Gln Gly Glu Gly Pro Met Thr Tyr Leu Asn Lys
260 265 270

Gly Gln Phe Tyr Ala Ile Thr Leu Ser Glu Thr Gly Asp Asn Lys Cys
275 280 285

Phe Arg His Pro Ile Ser Lys Val Arg Ser Val Val Met Val Val Phe
290 295 300

Ser Glu Asp Lys Asn Arg Asp Glu Gln Leu Lys Tyr Trp Lys Tyr Trp
305 310 315 320

His Ser Arg Gln His Thr Ala Lys Gln Arg Val Leu Asp Ile Ala Asp
325 330 335

Tyr Lys Glu Ser Phe Asn Thr Ile Gly Asn Ile Glu Glu Ile Ala Tyr
340 345 350

Asn Ala Val Ser Phe Thr Trp Asp Val Asn Glu Glu Ala Lys Ile Phe
355 360 365

Ile Thr Val Asn Cys Leu Ser Thr Asp Phe Ser Ser Gln Lys Gly Val
370 375 380

Lys Gly Leu Pro Leu Met Ile Gln Ile Asp Thr Tyr Ser Tyr Asn Asn
385 390 395 400

Arg Ser Asn Lys Pro Ile His Arg Ala Tyr Cys Gln Ile Lys Val Phe
405 410 415

Cys Asp Lys Gly Ala Glu Arg Lys Ile Arg Asp Glu Glu Arg Lys Gln
420 425 430

Asn Arg Lys Lys Gly Lys Gly Gln Ala Ser Gln Ala Gln Cys Asn Asn
435 440 445

Ser Ser Asp Gly Lys Met Ala Ala Ile Pro Leu Gln Lys Lys Ser Asp
450 455 460

Ile Thr Tyr Phe Lys Thr Met Pro Asp Leu His Ser Gln Pro Val Leu
465 470 475 480

Phe Ile Pro Asp Val His Phe Ala Asn Leu Gln Arg Thr Gly Gln Val
485 490 495

Tyr Tyr Asn Thr Asp Asp Glu Arg Glu Gly Ser Ser Val Leu Val Lys
500 505 510

Arg Met Phe Arg Pro Met Glu Glu Glu Phe Gly Pro Thr Pro Ser Lys
515 520 525

Gln Ile Lys Glu Glu Asn Val Lys Arg Val Leu Leu Tyr Val Arg Lys
530 535 540

Glu Asn Asp Asp Val Phe Asp Ala Leu Met Leu Lys Ser Pro Thr Val
545 550 555 560

Lys Gly Leu Met Glu Ala Leu Ser Glu Lys Tyr Gly Leu Pro Val Glu
565 570 575

Lys Ile Thr Lys Leu Tyr Lys Lys Ser Lys Lys Gly Ile Leu Val Asn
580 585 590

Met Asp Asp Asn Ile Ile Glu His Tyr Ser Asn Glu Asp Thr Phe Ile
595 600 605

Leu Asn Met Glu Ser Met Val Glu Gly Phe Lys Ile Thr Leu Met Glu
610 615 620

Ile
625



<210> 15 

<211> 2831 

<212> DNA 

<213> murine 

<220> 

<221> CDS 

<222> (200) . . (2008) 

<223> 



<220> 

<221> misc_feature

<222> (2806) . . (2806) 

<223> n = any nucleotide 
<400> 15

acctgtgctt ccagccaatc agcgccaccg cagccgggga ccgctgtcag caaaatctca 60 

acatccagag cgcaacgtag agcaaacgct tccccgggca ggaagggaat gtctgtgtca 120

gaggagaatt aagagacgag tggtcagcag cgcctgcgag ccaaccagag acggatcgct 180

ggaacctcgg agaaggaag atg tcg aat gaa ctt gat ttc agg tct gtg cgg 232
Met Ser Asn Glu Leu Asp Phe Arg Ser Val Arg
1 5 10

ttg ctg aag aat gac cct gtg agc ttc cag aag ttt ccc tac agt aat 280
Leu Leu Lys Asn Asp Pro Val Ser Phe Gln Lys Phe Pro Tyr Ser Asn
15 20 25

gag gac gag gcc tgg aag aca tac ctg gag aac cct ttg acg gct gcc 328
Glu Asp Glu Ala Trp Lys Thr Tyr Leu Glu Asn Pro Leu Thr Ala Ala
30 35 40

acc aaa gcc atg atg aga gtc aac ggg gac gag gag agt gtg gct gct 376
Thr Lys Ala Met Met Arg Val Asn Gly Asp Glu Glu Ser Val Ala Ala
45 50 55

ctg agc ttc ctc tac gac tac tat atg ggt ccc aag gag aag cgg ata 424
Leu Ser Phe Leu Tyr Asp Tyr Tyr Met Gly Pro Lys Glu Lys Arg Ile
60 65 70 75

ctg tcc tcc agc act ggt ggc cgg aat gac caa gga aag aag ttc tac 472
Leu Ser Ser Ser Thr Gly Gly Arg Asn Asp Gln Gly Lys Lys Phe Tyr
80 85 90

cac agc atg gac tat gag ccg gat ctt gcc ccc ctc gag agc ccc aca 520
His Ser Met Asp Tyr Glu Pro Asp Leu Ala Pro Leu Glu Ser Pro Thr
95 100 105

cac ctc atg aaa ttt ttg aca gag aac gtg tct gga agt cca gac tac 568
His Leu Met Lys Phe Leu Thr Glu Asn Val Ser Gly Ser Pro Asp Tyr
110 115 120

aca gac cag ctc aag aaa aac aat ctg cta ggc ttg gag ggg gtt cta 616
Thr Asp Gln Leu Lys Lys Asn Asn Leu Leu Gly Leu Glu Gly Val Leu
125 130 135

ccc acc ccc ggc aag acc aat acc gtc ccc cca ggt ccg agt aaa ctg 664
Pro Thr Pro Gly Lys Thr Asn Thr Val Pro Pro Gly Pro Ser Lys Leu
140 145 150 155

gaa gcc agc tcc atg gac agc tac ctc ttg ccc gcc agt gac ata tat 712
Glu Ala Ser Ser Met Asp Ser Tyr Leu Leu Pro Ala Ser Asp Ile Tyr
160 165 170

gac aat ggc tcc ctc aac tca tta ttt gag agc att cat ggg gtt cca 760
Asp Asn Gly Ser Leu Asn Ser Leu Phe Glu Ser Ile His Gly Val Pro
175 180 185

ccc aca cag cgc tgg cag cca gac agc acc ttc aaa gat gac cca cag 808
Pro Thr Gln Arg Trp Gln Pro Asp Ser Thr Phe Lys Asp Asp Pro Gln
190 195 200

gag tct ctg ctc ttc cct gat att ctg aag aca tcc ccg gac ccc cca 856
Glu Ser Leu Leu Phe Pro Asp Ile Leu Lys Thr Ser Pro Asp Pro Pro
205 210 215

tgc cca gag gat tat cca ggc ctc aag agt gac ttt gaa tac acc ctg 904
Cys Pro Glu Asp Tyr Pro Gly Leu Lys Ser Asp Phe Glu Tyr Thr Leu
220 225 230 235

ggc tcc ccc aaa gcc att cac atc aaa gca ggg gag tca ccc atg gcc 952
Gly Ser Pro Lys Ala Ile His Ile Lys Ala Gly Glu Ser Pro Met Ala
240 245 250

tac ctc aac aag ggt cag ttc tac ccc gtc acc cta cgc acc cca gca 1000
Tyr Leu Asn Lys Gly Gln Phe Tyr Pro Val Thr Leu Arg Thr Pro Ala
255 260 265

gga ggg aaa ggc ctc gct ctg tcc tcc agc aaa gtc aag agc gtg gtg 1048
Gly Gly Lys Gly Leu Ala Leu Ser Ser Ser Lys Val Lys Ser Val Val
270 275 280

atg gtc gtg ttc gat aat gac aag gtc ccc gtg gag cag ctg cgt ttc 1096
Met Val Val Phe Asp Asn Asp Lys Val Pro Val Glu Gln Leu Arg Phe
285 290 295

tgg agg cac tgg cat tcc cgg cag ccc acc gcc aag cag cgc gtc atc 1144
Trp Arg His Trp His Ser Arg Gln Pro Thr Ala Lys Gln Arg Val Ile
300 305 310 315

gac gta gct gac tgt aag gaa aac ttc aac acg gtc cag cac att gaa 1192
Asp Val Ala Asp Cys Lys Glu Asn Phe Asn Thr Val Gln His Ile Glu
320 325 330

gag gtg gcc tat aac gcg ctg tcc ttt gtg tgg aat gtc aac gag gaa 1240
Glu Val Ala Tyr Asn Ala Leu Ser Phe Val Trp Asn Val Asn Glu Glu
335 340 345

gcc aag gtg ttt atc ggt gtc aac tgt ctg agc aca gac ttc tcc tcg 1288
Ala Lys Val Phe Ile Gly Val Asn Cys Leu Ser Thr Asp Phe Ser Ser
350 355 360

cag aag gga gtg aag ggt gtc ccc ctg aac ttg caa att gac acc tat 1336
Gln Lys Gly Val Lys Gly Val Pro Leu Asn Leu Gln Ile Asp Thr Tyr
365 370 375

gac tgt gga gca ggc act gag cgc ctg gta cac cgt gct gtc tgc cag 1384
Asp Cys Gly Ala Gly Thr Glu Arg Leu Val His Arg Ala Val Cys Gln
380 385 390 395

atc aag atc ttc tgt gat aag gga gct gag agg aag atg cgc gat gat 1432
Ile Lys Ile Phe Cys Asp Lys Gly Ala Glu Arg Lys Met Arg Asp Asp
400 405 410

gaa cgg aag cag ttt cga agg aag gtc aag tgc cca gac tcc agt aac 1480
Glu Arg Lys Gln Phe Arg Arg Lys Val Lys Cys Pro Asp Ser Ser Asn
415 420 425
aat gca gga ate aag ggc tgc ctg ctg tea ggc ttc agg age aat aaa 

116 LyS Gly <*» Leu Leu Ser Gly Phe A?g Ity A^n IS 
430 435 440 

acc aca tac ttg egg cea gaa act gac ctg gag acc cag cct ata ttc, 
Thr Thr Tyr Leu Arg Pro Glu Thr Asp Leu IS Thr Gin Pro S2 Leu 
S 450 455 

ttt ate ccc aat ctg cat ttt tec age eta cag cgc cca gga ggg gtt 
Phe lie Pro Asn Leu His Phe Ser Ser Leu Gin Arg Pro Ify Ify SSJ 

465 470 475 

gtc ccc tea gca gga cac age age tct gac agg ctg cct ctg aag cga 
Val Pro Ser Ala Gly His Ser Ser Ser Asp A?g Leu Pro Leu tyt A?g 
480 485 49o 

acc tgc tea ccc ttt get. gag gag ttt gag ect ctt cct tct aaa caa 
Thr Cys Ser Pro Phe Ala Glu Glu Phe IS Pro Leu Pro Ser Lyt Gin 
495 500 sos 

gee aag gaa gat gac ctt cag aga gtt ctg ttg tat gtg agg agg gag 
Ala Lys Glu Asp Asp Leu Gin Arg Val Leu Leu Tyr Va! A?g Sg IS 
510 515 520 

aca gag gag gtg ttt gac gcg etc atg ttg aag acc ccg gac ctg aag 
Thr Glu Glu Val Phe Asp Ala Leu Met Leu Lyf Thr Pro fsp Seu lJs 
a <* 5 530 535 

ggc ctg agg aat gcg ate tct gag aag tac ggc etc ccc gag gag aat. 
Gly Leu Arg Asn Ala He Ser Glu Lys Tyr Gly Leu Pro IS IS Asn 

545 550 555 

Til S fc ? ta ° aa9 aaa tgc aa 9 c 9 a aac at <= ctg gtt aac atg 

He Cys Lys Val Tyr Lys Lys Cys Lys Arg Gly He LeS Val As"n Sec 

560 565 570 



gac aac aac ate ate caa cac tac age aac cac gtg gee ttc ctg ctg 
Asp Asn Asn lie He Gin His Tyr Ser Asn His Va! Ala Phe Leu Leu 
575 580 585 

gac atg ggt gag ctg gac ggc aag ate cag ate ate eta aaa ma r-i-= 
Asp Met Gly Glu Leu Asp Gly Lys He Gin He He Leu Lys IS Leu 
590 595 goo 

tgagggcccg gcctcaagcg tcccacaccc ggggcccggc tcaagceacg tacaacctct 

tctgtgtcag ctgttacttg aaatgecttt ctttgggaaa gaggtctegc aagcaaccaa 

ctcggtgatg tccaagccag ggagagacca agaaggttcc aggatctaaa tgtcccaccc 

aggctcgaac tcactccaga gcttcctgaa agcacccagc ccaccggaga gtctgagcaa 

cacagaccca actgcctgct ttctcttcta agtcccgctg cagaggccct tacaggggac 

gggggtcaca ccaccttctc tgcagggcta cacccgctgt etcgateggt tetgaegtte 



1528 

1576 

1624 

1672 

1720 

1768 

1816 

1864 

1912 

1960 

2008 

2068 
2128 
2188 
2248 
2308 
2368 
actgtttcct ttctaccaac ttcagaccag agagttctca cactttggcc aaataacttg 2428
aaaactcgtg actttcacag cagatgcctt tgtgaggccc ttggagagga aactttctta 2488
ttgacttcct cggcacaaga tgtaagtcac catcatcgag ctgacaggaa caaataccct 2548
tgccacctac tgttgtacac atttcttatt tacagttttc attatgtgat tatatatata 2608
tatatgtaag tatatattat gtacatatat gcaacatttt gtatgtccat gttacatttt 2668
tatcatttca aaaatatgta tttcatattt cttgaactat ttttttagct gttattcgat 2728
tatgcatttt gtatatcata gggtttagta ataaaagcct acccatgcac acttaaaaaa 2788
aaaaaaaaaa aaatatcnag cttatcgata ccgtcgacct cga 2831

<210> 16

<211> 603

<212> PRT

<213> murine

<220>

<221> misc_feature

<222> (2806) . . (2806)

<223> n = any nucleotide

<400> 16

Met Ser Asn Glu Leu Asp Phe Arg Ser Val Arg Leu Leu Lys Asn Asp
1 5 10 15

Pro Val Ser Phe Gln Lys Phe Pro Tyr Ser Asn Glu Asp Glu Ala Trp
20 25 30

Lys Thr Tyr Leu Glu Asn Pro Leu Thr Ala Ala Thr Lys Ala Met Met
35 40 45

Arg Val Asn Gly Asp Glu Glu Ser Val Ala Ala Leu Ser Phe Leu Tyr
50 55 60

Asp Tyr Tyr Met Gly Pro Lys Glu Lys Arg Ile Leu Ser Ser Ser Thr
65 70 75 80

Gly Gly Arg Asn Asp Gln Gly Lys Lys Phe Tyr His Ser Met Asp Tyr
85 90 95

Glu Pro Asp Leu Ala Pro Leu Glu Ser Pro Thr His Leu Met Lys Phe
100 105 110

Leu Thr Glu Asn Val Ser Gly Ser Pro Asp Tyr Thr Asp Gln Leu Lys
115 120 125

Lys Asn Asn Leu Leu Gly Leu Glu Gly Val Leu Pro Thr Pro Gly Lys
130 135 140

Thr Asn Thr Val Pro Pro Gly Pro Ser Lys Leu Glu Ala Ser Ser Met
145 150 155 160

Asp Ser Tyr Leu Leu Pro Ala Ser Asp Ile Tyr Asp Asn Gly Ser Leu
165 170 175

Asn Ser Leu Phe Glu Ser Ile His Gly Val Pro Pro Thr Gln Arg Trp
180 185 190

Gln Pro Asp Ser Thr Phe Lys Asp Asp Pro Gln Glu Ser Leu Leu Phe
195 200 205

Pro Asp Ile Leu Lys Thr Ser Pro Asp Pro Pro Cys Pro Glu Asp Tyr
210 215 220

Pro Gly Leu Lys Ser Asp Phe Glu Tyr Thr Leu Gly Ser Pro Lys Ala
225 230 235 240

Ile His Ile Lys Ala Gly Glu Ser Pro Met Ala Tyr Leu Asn Lys Gly
245 250 255

Gln Phe Tyr Pro Val Thr Leu Arg Thr Pro Ala Gly Gly Lys Gly Leu
260 265 270

Ala Leu Ser Ser Ser Lys Val Lys Ser Val Val Met Val Val Phe Asp
275 280 285

Asn Asp Lys Val Pro Val Glu Gln Leu Arg Phe Trp Arg His Trp His
290 295 300

Ser Arg Gln Pro Thr Ala Lys Gln Arg Val Ile Asp Val Ala Asp Cys
305 310 315 320

Lys Glu Asn Phe Asn Thr Val Gln His Ile Glu Glu Val Ala Tyr Asn
325 330 335

Ala Leu Ser Phe Val Trp Asn Val Asn Glu Glu Ala Lys Val Phe Ile
340 345 350

Gly Val Asn Cys Leu Ser Thr Asp Phe Ser Ser Gln Lys Gly Val Lys
355 360 365

Gly Val Pro Leu Asn Leu Gln Ile Asp Thr Tyr Asp Cys Gly Ala Gly
370 375 380

Thr Glu Arg Leu Val His Arg Ala Val Cys Gln Ile Lys Ile Phe Cys
385 390 395 400

Asp Lys Gly Ala Glu Arg Lys Met Arg Asp Asp Glu Arg Lys Gln Phe
405 410 415

Arg Arg Lys Val Lys Cys Pro Asp Ser Ser Asn Asn Ala Gly Ile Lys
420 425 430

Gly Cys Leu Leu Ser Gly Phe Arg Gly Asn Glu Thr Thr Tyr Leu Arg
435 440 445

Pro Glu Thr Asp Leu Glu Thr Gln Pro Val Leu Phe Ile Pro Asn Leu
450 455 460

His Phe Ser Ser Leu Gln Arg Pro Gly Gly Val Val Pro Ser Ala Gly
465 470 475 480

His Ser Ser Ser Asp Arg Leu Pro Leu Lys Arg Thr Cys Ser Pro Phe
485 490 495

Ala Glu Glu Phe Glu Pro Leu Pro Ser Lys Gln Ala Lys Glu Asp Asp
500 505 510

Leu Gln Arg Val Leu Leu Tyr Val Arg Arg Glu Thr Glu Glu Val Phe
515 520 525

Asp Ala Leu Met Leu Lys Thr Pro Asp Leu Lys Gly Leu Arg Asn Ala
530 535 540

Ile Ser Glu Lys Tyr Gly Leu Pro Glu Glu Asn Ile Cys Lys Val Tyr
545 550 555 560

Lys Lys Cys Lys Arg Gly Ile Leu Val Asn Met Asp Asn Asn Ile Ile
565 570 575

Gln His Tyr Ser Asn His Val Ala Phe Leu Leu Asp Met Gly Glu Leu
580 585 590

Asp Gly Lys Ile Gln Ile Ile Leu Lys Glu Leu
595 600

<210> 17 

<211> 4840 

<212> DNA 

<213> Drosophila

<400> 17 



aaaaatagaa aaaacaacaa caaattggct tgaaaacgca aatgccaggc gcaacgcccc 60

cgaaccgacc cgccccctca acttttgcgc cctccagtag caatagcagc aatatgagca 120

gcagcaacat caaatgttag gccaaaatgc acaaaccgcc agcaacaaag gcagcaccaa 180

gcgaacgaaa caacaacagc tccacatacc acaaagagtg gcacattaga agcggccaaa 240

agcagccagc cgagagcatt gtgtaagcca aaggcccaga gagccaggct aaaagccccc 300

agacgcacaa caacaacaac aacaactaaa acagcacaaa gagtggcgaa aggtgcaccc 360

accagcaaaa cagcaacaac ggagcaacca acaacagcag cagcagcagc agcagccaca 420

tttcagttac agctccagac tcccaggttg cagactccca aagcaaacag actccagtcc 480

acgatccagc tccagttcca ccgatccgat ccactgctcc agcgtgctcg agtgccatag 540

atcctcacca agtgccaaaa tccgcatcct gatcccaaga gctcaaggca ccccggccca 600

aaattgagct gagaacgaaa cgaaggaagt tccttagtgc catagaaagc agttaatgaa 660

acaacgacta agacgaagat cgaccatcca gaaccggagg gagctaattg cgaacgaaag 720

aaaccacaaa gtgccttcca tcaatccgtt gataagtgat atttattatg tttatacttg 780

ccagcagccg aggcagcaac agcaatagca acaaccatag gggatcacgg catcgatgat 840

cagtccacga ccaagtccta gtgcaatccg gaatccagtt caaattagtt caataagccg 900

tatctaccac gtataatgtc cacatccacc gccacaacga gcgttatcac gtccaacgag 960

ctctcgctgt ccggccacgc ccacggtcac ggtcacgccc accagttgca ccagcacacc 1020

cacagccgcc taggagttgg cgttggtgtt ggcatcctta gcgacgcatc cctatcgccc 1080

atccaacaag gcagtggcgg ccacagcggc ggaggtaaca caaacagttc accactggcg 1140

cccaacggag tgccacttct cacaacaatg caccgatcac cggactcacc gcagccagaa 1200

ttggccacca tgacgaacgt caacgtgctg gatctgcaca cggataactc caagctgtac 1260
aacaaaaaaci i 


ctgtatttat 


atacgaaacg 


cccaaggtgg 


tgatgccagc < 


ggatggcggg 


1320 


aatcrcrcaata 


attccgatga 


aggtcatgcc 


atcgatgcgc 


ggattgcggc 


ccaaatgggc 


1380 


aarcaaaccc 


agcaacagca 


gcagcagcaa 


caqcaqacqcr 


aacaccagcc 


gctggccaag 


1440 


atcaaattca 


atgagaacca 


gataatccgg 


Qtacrtaqqac 

3 33 % **333 w,w 


caaatggcga 


gcaacagcaa 


1500 




gggagafccafc 


caataQQQacr 

^»awyy3y»3 


catcatatcc 


tgtcgcgaaa 


cgaggctggt 


1560 


craararattc 


tcacacggat 


cgtcagtgat 


ccctccaagt 


tgatgcccaa 


tgacaatgca 


1620 


ataaccacaa 


ccatgtacaa 


ccaggcccaa 


aagatgaaca 


atgatcacgg 


gcaggcggta 


1680 


l*ai*raaacat 


caccattgcc 


gctagacgcg 


tctgtattgc 


attatagtgg 


cggcaatgat 


1740 


i~ raaatataa 


fcfcaaQaccrcra 


ggccgatatc 


tacgaggatc 


acaagaaaca 


tgcggctgca 


1800 


araaraacta 


ctcrccaacQQ 


aggatccatc 


atatacacca 


catccgatcc 


gaacggagtg 


1860 


aatataaaac 


aactgcccca 


tttgacggta 


ccccaaaaac 


ttgatcccga 


cctctatcaa 


1920 




atataaattt 


gatctacaac 


gatggcagca 


agacggtgat 


ttactccact 


1980 


acacrat eacr a 

ClV^^jy W. w W dy 


acr act 1 1 1 acr a 


aatatactcg 


ggcggcgaca 


tcggcagcct 


ggtgtccgac 


2040 


rr rrr» r» ^ 3 cr i~ crcr 


fcacr fcccacrcrc 


cracia c fccr c cq 
yyy a '" 3 ^^-^j 


tatgccacca 


ccaccggagc 


cggcggccag 


2100 


> — V y LUl>ClLCl 




cacrtcrccfctcr 
^yy *-3 wv * v * ^3 


ccagcgggag 


tcgaggagca 


tctgcagagt 


2160 




ataaccaoac 


cacacctatc 


gatgtctctg gcctatcgca 


aaatgagatt 


2220 


caacrcictttt 

v.. aciiy y w u u k w 


tgctcggctc 


acacccctcg 


tcatcggcga 


cggtaagcac 


aaccggcgtt 


2280 


atctccacaa caacgatctc gcatcaccag caacagcagc agcagcagca acagcaacag 2340

caacaacaac agcagcaaca ccagcagcag cagcaacatc ccggcgacat tgttagtgcc 2400






ctccattgtc 


tcctctgcgg 


cgcaacagca 


gcagcagcag 


2460 


caactaatta gcatcaaacg agagcccgaa gacttgcgca aggatcccaa gaatggcaac 2520

attgccggtg cagcaacagc aaatggaccc ggttcggtca taacccaaaa gtccttcgat 2580

tatacggaat tgtgccagcc gggcacgctg atcgatgcca atggcagcat acccgtcagc 2640

gtgaacagca tccagcagag aacggcggtc catggcagcc agaacagtcc caccacatcg 2700

ctggtggaca ccagcaccaa tggatccacg cgatcgcggc cctggcacga ctttggacgt 2760

cagaatgatg ccgacaaaat acaaatacca aaaatcttca caaacgtggg cttccgatat 2820

cacctggaga gccccatcag ttcatcgcag aggcgcgagg acgatcgcat cacctacatc 2880

aacaagggtc aattctatgg aataacgctg gagtatgtgc acgatgcgga aaagcccatt 2940
aagaacacca ccgtcaagag tgtgatcatg ctaatgttcc gcgaggagaa gagtcccgag 3000
gatgagatca aggcctggca attctggcac agtcgtcagc attccgtgaa gcagagaatc 3060
ttggatgcag atacgaagaa ctcggttggc ctcgttggct gcatcgagga agtgtcgcac 3120
aatgccatcg ccgtctactg gaatccgctg gagagctccg ccaagatcaa cattgcggtt 3180
cagtgcttga gcacggattt cagcagtcaa aagggaggcc tgccgctgca cgtacaaatc 3240
gacacatttg aggaccccag agatacggcg gtcttccacc gcggctactg tcagataaag 3300
gtcttctgcg ataagggcgc cgaacgaaag acgcgcgatg aagagcggcg ggccgccaaa 3360
cgaaagatga cagccacggg cagaaagaag ctggacgagc tttaccatcc ggtaacggat 3420
cggtccgagt tctatggcat gcaggacttc gccaagccgc cggtgctatt ctcgcccgcc 3480
gaggacatgg agaaggtagg tcagctgggc attggcgctg ccaccggcat gacattcaac 3540
cccctgagca acggcaactc caactccaac tcgcactcgt ccttgcagag cttctacggc 3600
catgagactg actcgccgga cctgaagggg gcctcaccgt tcctgctcca cggccagaag 3660
gtggccacgc cgacgctcaa gttccacaac cattttccgc ccgacatgca gaccgataag 3720
aaggatcaca tactggacca gaacatgttg accagcacac ccctgaccga ctttggtccg 3780
ccgatgaagc gcggcaggat gacgccgccg acctcggaac gcgtgatgct gtacgtgcgg 3840
caggagaacg aggaggtgta tacaccgttg cacgtggtgc cgcccaccac gatcggcctg 3900
ctaaatgcga ttgaaaacaa atacaaaatc tcaacaacga gcataaataa catttatcgc 3960
acaaacaaga aggggattac tgcgaaaatt gacgatgaca tgatatcgtt ctactgcaac 4020
gaggacatct ttctgctgga ggtgcaacag atcgaggacg acctgtacga tgtgacgctc 4080
acggagctgc ccaatcagta gcgctggcag tacgggtagc acccgctaac cgcactcaaa 4140
aaaaaaagca aacaaacaca caaattacgg acacaacaag ttgtttcaat aagccatttt 4200
ccatagagcc taagtctaaa tatcgtagtt ataataatgg gatccgcaac aaatcgagtt 4260
gcaacgaatg ttaagaacgc taacacaata cgcatgtaaa atgatacttt aaaattgatt 4320
tagttatttt agcaacaatg agattatcta aaattgtttg atcaaatttt acattctcgc 4380
tatgtctata gataattcta agcccgtaag cccataagcg taatcgtaat cgtaatcgta 4440
ccgtgtattt atgctcatat ataaacaact atatatatat atatatatat atatatgtgc 4500
ggagtgcaac agtgtctgtc cagtaggaga taagtctcgt ttccgctccc ctgcttatgc 4560
tatgacctta ggtccagggc aagtatgagt taccgaatct atctattagg tgcatctaac 4620
gaaaggaatc attagctctg cacgaactct agccgtagcc tattgtaatc catttgtatg 4680
tttggcttaa gcgttttact tgttgaatat aaagtgtaaa attatttttg aaaaaaaaaa 4740
acccacacaa aacacaaatc gtttgttcta tatttctgtt tcaaaactaa ctcgttaccc 4800
acaatcccct ctgttatgta taattaggat ctctgtacac 4840

<210> 18 

<211> 1061 

<212> PRT 

<213> murine 

<400> 18 

Met Ser Thr Ser Thr Ala Thr Thr Ser Val Ile Thr Ser Asn Glu Leu
1 5 10 15

Ser Leu Ser Gly His Ala His Gly His Gly His Ala His Gln Leu His
20 25 30

Gln His Thr His Ser Arg Leu Gly Val Gly Val Gly Val Gly Ile Leu
35 40 45

Ser Asp Ala Ser Leu Ser Pro Ile Gln Gln Gly Ser Gly Gly His Ser
50 55 60

Gly Gly Gly Asn Thr Asn Ser Ser Pro Leu Ala Pro Asn Gly Val Pro
65 70 75 80

Leu Leu Thr Thr Met His Arg Ser Pro Asp Ser Pro Gln Pro Glu Leu
85 90 95

Ala Thr Met Thr Asn Val Asn Val Leu Asp Leu His Thr Asp Asn Ser
100 105 110

Lys Leu Tyr Asp Lys Glu Ala Val Phe Ile Tyr Glu Thr Pro Lys Val
115 120 125

Val Met Pro Ala Asp Gly Gly Gly Gly Asn Asn Ser Asp Glu Gly His
130 135 140

Ala Ile Asp Ala Arg Ile Ala Ala Gln Met Gly Asn Gln Ala Gln Gln
145 150 155 160

Gln Gln Gln Gln Gln Gln Gln Thr Glu His Gln Pro Leu Ala Lys Ile
165 170 175

Glu Phe Asp Glu Asn Gln Ile Ile Arg Val Val Gly Pro Asn Gly Glu
180 185 190

Gln Gln Gln Ile Ile Ser Arg Glu Ile Ile Asn Gly Glu His His Ile
195 200 205

Leu Ser Arg Asn Glu Ala Gly Glu His Ile Leu Thr Arg Ile Val Ser
210 215 220

Asp Pro Ser Lys Leu Met Pro Asn Asp Asn Ala Val Ala Thr Ala Met
225 230 235 240

Tyr Asn Gln Ala Gln Lys Met Asn Asn Asp His Gly Gln Ala Val Tyr
245 250 255

Gln Thr Ser Pro Leu Pro Leu Asp Ala Ser Val Leu His Tyr Ser Gly
260 265 270

Gly Asn Asp Ser Asn Val Ile Lys Thr Glu Ala Asp Ile Tyr Glu Asp
275 280 285

His Lys Lys His Ala Ala Ala Ala Ala Ala Ala Ala Gly Gly Gly Ser
290 295 300

Ile Ile Tyr Thr Thr Ser Asp Pro Asn Gly Val Asn Val Lys Gln Leu
305 310 315 320

Pro His Leu Thr Val Pro Gln Lys Leu Asp Pro Asp Leu Tyr Gln Ala
325 330 335

Asp Lys His Ile Asp Leu Ile Tyr Asn Asp Gly Ser Lys Thr Val Ile
340 345 350

Tyr Ser Thr Thr Asp Gln Lys Ser Leu Glu Ile Tyr Ser Gly Gly Asp
355 360 365

Ile Gly Ser Leu Val Ser Asp Gly Gln Val Val Val Gln Ala Gly Leu
370 375 380

Pro Tyr Ala Thr Thr Thr Gly Ala Gly Gly Gln Pro Val Tyr Ile Val
385 390 395 400

Ala Asp Gly Ala Leu Pro Ala Gly Val Glu Glu His Leu Gln Ser Gly
405 410 415

Lys Leu Asn Gly Gln Thr Thr Pro Ile Asp Val Ser Gly Leu Ser Gln
420 425 430

Asn Glu Ile Gln Gly Phe Leu Leu Gly Ser His Pro Ser Ser Ser Ala
435 440 445

Thr Val Ser Thr Thr Gly Val Val Ser Thr Thr Thr Ile Ser His His
450 455 460

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
465 470 475 480

Gln His Gln Gln Gln Gln Gln His Pro Gly Asp Ile Val Ser Ala Ala
485 490 495

Gly Val Gly Ser Thr Gly Ser Ile Val Ser Ser Ala Ala Gln Gln Gln
500 505 510

Gln Gln Gln Gln Leu Ile Ser Ile Lys Arg Glu Pro Glu Asp Leu Arg
515 520 525

Lys Asp Pro Lys Asn Gly Asn Ile Ala Gly Ala Ala Thr Ala Asn Gly
530 535 540

Pro Gly Ser Val Ile Thr Gln Lys Ser Phe Asp Tyr Thr Glu Leu Cys
545 550 555 560

Gln Pro Gly Thr Leu Ile Asp Ala Asn Gly Ser Ile Pro Val Ser Val
565 570 575

Asn Ser Ile Gln Gln Arg Thr Ala Val His Gly Ser Gln Asn Ser Pro
580 585 590

Thr Thr Ser Leu Val Asp Thr Ser Thr Asn Gly Ser Thr Arg Ser Arg
595 600 605

Pro Trp His Asp Phe Gly Arg Gln Asn Asp Ala Asp Lys Ile Gln Ile
610 615 620

Pro Lys Ile Phe Thr Asn Val Gly Phe Arg Tyr His Leu Glu Ser Pro
625 630 635 640

Ile Ser Ser Ser Gln Arg Arg Glu Asp Asp Arg Ile Thr Tyr Ile Asn
645 650 655

Lys Gly Gln Phe Tyr Gly Ile Thr Leu Glu Tyr Val His Asp Ala Glu
660 665 670

Lys Pro Ile Lys Asn Thr Thr Val Lys Ser Val Ile Met Leu Met Phe
675 680 685

Arg Glu Glu Lys Ser Pro Glu Asp Glu Ile Lys Ala Trp Gln Phe Trp
690 695 700

His Ser Arg Gln His Ser Val Lys Gln Arg Ile Leu Asp Ala Asp Thr
705 710 715 720

Lys Asn Ser Val Gly Leu Val Gly Cys Ile Glu Glu Val Ser His Asn
725 730 735

Ala Ile Ala Val Tyr Trp Asn Pro Leu Glu Ser Ser Ala Lys Ile Asn
740 745 750

Ile Ala Val Gln Cys Leu Ser Thr Asp Phe Ser Ser Gln Lys Gly Gly
755 760 765

Leu Pro Leu His Val Gln Ile Asp Thr Phe Glu Asp Pro Arg Asp Thr
770 775 780

Ala Val Phe His Arg Gly Tyr Cys Gln Ile Lys Val Phe Cys Asp Lys
785 790 795 800

Gly Ala Glu Arg Lys Thr Arg Asp Glu Glu Arg Arg Ala Ala Lys Arg
805 810 815

Lys Met Thr Ala Thr Gly Arg Lys Lys Leu Asp Glu Leu Tyr His Pro
820 825 830

Val Thr Asp Arg Ser Glu Phe Tyr Gly Met Gln Asp Phe Ala Lys Pro
835 840 845

Pro Val Leu Phe Ser Pro Ala Glu Asp Met Glu Lys Val Gly Gln Leu
850 855 860

Gly Ile Gly Ala Ala Thr Gly Met Thr Phe Asn Pro Leu Ser Asn Gly
865 870 875 880

Asn Ser Asn Ser Asn Ser His Ser Ser Leu Gln Ser Phe Tyr Gly His
885 890 895

Glu Thr Asp Ser Pro Asp Leu Lys Gly Ala Ser Pro Phe Leu Leu His
900 905 910

Gly Gln Lys Val Ala Thr Pro Thr Leu Lys Phe His Asn His Phe Pro
915 920 925

Pro Asp Met Gln Thr Asp Lys Lys Asp His Ile Leu Asp Gln Asn Met
930 935 940

Leu Thr Ser Thr Pro Leu Thr Asp Phe Gly Pro Pro Met Lys Arg Gly
945 950 955 960

Arg Met Thr Pro Pro Thr Ser Glu Arg Val Met Leu Tyr Val Arg Gln
965 970 975

Glu Asn Glu Glu Val Tyr Thr Pro Leu His Val Val Pro Pro Thr Thr
980 985 990

Ile Gly Leu Leu Asn Ala Ile Glu Asn Lys Tyr Lys Ile Ser Thr Thr
995 1000 1005

Ser Ile Asn Asn Ile Tyr Arg Thr Asn Lys Lys Gly Ile Thr Ala
1010 1015 1020

Lys Ile Asp Asp Asp Met Ile Ser Phe Tyr Cys Asn Glu Asp Ile
1025 1030 1035

Phe Leu Leu Glu Val Gln Gln Ile Glu Asp Asp Leu Tyr Asp Val
1040 1045 1050

Thr Leu Thr Glu Leu Pro Asn Gln
1055 1060



<210> 19 
<211> 21 
<212> DNA
<213> human 

<400> 19 

gaagtctttg atgccctgat g 



<210> 20 

<211> 21 

<212> DNA 

<213> human 

<220> 

<221> misc_feature

<223> human p49 mgr 



<400> 20 

aacccattcc ctcgacatag a 



<210> 21 

<211> 20 

<212> DNA 

<213> human 

<400> 21 

agcgcgatga cacaggagta 



<210> 22 

<211> 20 

<212> DNA 

<213> human 

<400> 22 

cgttgctatg gagacagtga 



<210> 23 

<211> 20 

<212> DNA 

<213> human 

<400> 23 

ccgtttaaca aggacactgc 



<210> 24 

<211> 20 

<212> DNA 

<213> murine 



<400> 24 

ctggaagcca ccaaatctct 



<210> 25

<211> 20 

<212> DNA 

<213> murine 

<400> 25 

agcgcgatga cacaggagta 



<210> 26 

<211> 20 

<212> DNA 

<213> murine 

<400> 26 

agtgccagag ctgaactgat 






<210> 27 

<211> 20 

<212> DNA 

<213> murine 

<400> 27 

tccatgggtt ccttgagttc 



<210> 28 

<211> 20 

<212> DNA 

<213> murine 

<400> 28 

agtgccagag ctgaactgat 



<210> 29 

<211> 20 

<212> DNA 

<213> murine 

<400> 29 

aaaggggagc gagttcattg 






<210> 30 

<211> 20 

<212> DNA 

<213> murine 

<400> 30 

agagctctcg gtgatggata 



<210> 31
<211> 34
<212> DNA
<213> drosophila dopa decarboxylase promoter
<400> 31

ggtggtgctc taataaccgg tttccaagat gcgc 34

<210> 32 
<211> 34 
<212> DNA 

<213> drosophila PCNA promoter 
<400> 32 

gggtaaaaag tgtgaacaat caaaccagtt ggca 34 

<210> 33 

<211> 84 

<212> DNA 

<213> human 

<400> 33 

ggacacacac ccaaacccac acccacccac aaacacacaa accggcagtg acaacaacca 60 
cccatccttc aataacagca acca 84 

<210> 34 

<211> 4747 

<212> DNA 

<213> Drosophila 

<400> 34 

aaaaatagaa aaaacaacaa caaattggct tgaaaacgca aatgccaggc gcaacgcccc 60 

cgaaccgacc cgccccctca acttttgcgc cctccagtag caatagcagc aatatgagca 120 

gcagcaacat caaatgttag gccaaaatgc acaaaccgcc agcaacaaag gcagcaccaa 180 

gcgaacgaaa caacaacagc tccacatacc acaaagagtg gcacattaga agcggccaaa 240 

agcagccagc cgagagcatt gtgtaagcca aaggcccaga gagccaggct aaaagccccc 300 

agacgcacaa caacaacaac aacaactaaa acagcacaaa gagtggcgaa aggtgcaccc 360 

accagcaaaa cagcaacaac ggagcaacca acaacagcag cagcagcagc agcagccaca 420

tttcagttac agctccagac tcccaggttg cagactccca aagcaaacag actccagtcc 480

acgatccagc tccagttcca ccgatccgat ccactgctcc agcgtgctcg agtgccatag 540

atcctcacca agtgccaaaa tccgcatcct gatcccaaga gctcaaggca ccccggccca 600 

aaattgagct gagaacgaaa cgaaggaagt tccttagtgc catagaaagc agttaatgaa 660 

acaacgacta agacgaagat cgaccatcca gaaccggagg gagctaattg cgaacgaaag 720 

aaaccacaaa gtgccttcca tcaatccgtt gataagtgat atttattatg tttatacttg 780 
ccagcagccg aggcagcaac agcaatagca acaaccatag gggatcacgg catcgatgat 840

cagtccacga ccaagtccta gtgcaatccg gaatccagtt caaattagtt caataagccg 900

tatctaccac gtataatgtc cacatccacc gccacaacga gcgttatcac gtccaacgag 960

ctctcgctgt ccggccacgc ccacggtcac ggtcacgccc accagttgca ccagcacacc 1020

cacagccgcc taggagttgg cgttggtgtt ggcatcctta gcgacgcatc cctatcgccc 1080

atccaacaag gcagtggcgg ccacagcggc ggaggtaaca caaacagttc accactggcg 1140

cccaacggag tgccacttct cacaacaatg caccgatcac cggactcacc gcagccagaa 1200

ttggccacca tgacgaacgt caacgtgctg gatctgcaca cggataactc caagctgtac 1260

gacaaggagg ctgtatttat atacgaaacg cccaaggtgg tgatgccagc ggatggcggg 1320

ggtggcaata attccgatga aggtcatgcc atcgatgcgc ggattgcggc ccaaatgggc 1380

aaccaagccc agcaacagca gcagcagcaa cagcagacgg aacaccagcc gctggccaag 1440

atcgagttcg atgagaacca gataatccgg gtggtgggac caaatggcga gcaacagcaa 1500

atcatctcgc gggagatcat caatggggag catcatatcc tgtcgcgaaa cgaggctggt 1560

gagcacattc tcacacggat cgtcagtgat ccctccaagt tgatgcccaa tgacaatgca 1620

gtggccacgg ccatgtacaa ccaggcccaa aagatgaaca atgatcacgg gcaggcggta 1680

tatcagacat caccattgcc gctagacgcg tctgtattgc attatagtgg cggcaatgat 1740

tcgaatgtaa ttaagacgga ggccgatatc tacgaggatc acaagaaaca tgcggctgca 1800

gcagcagctg ctgccggcgg aggatccatc atatacacca catccgatcc gaacggagtg 1860

aatgtgaaac aactgcccca tttgacggta ccccaaaaac ttgatcccga cctctatcaa 1920

gccgataagc atatagattt gatctacaac gatggcagca agacggtgat ttactccact 1980

acggatcaga agagtttgga aatatactcg ggcggcgaca tcggcagcct ggtgtccgac 2040

ggccaagtgg tggtccaggc gggactgccg tatgccacca ccaccggagc cggcggccag 2100

cccgtctata tcgtggccga cggtgccttg ccagcgggag tcgaggagca tctgcagagt 2160

ggaaagctca atggccagac cacacctatc gatgtctctg gcctatcgca aaatgagatt 2220

caaggctttt tgctcggctc acacccctcg tcatcggcga cggtaagcac aaccggcgtt 2280

gtctccacga caacgatctc gcatcaccag caacagcagc agcagcagca acagcaacag 2340

cagcagcagc agcagcaaca ccagcagcag cagcaacatc ccggcgacat tgttagtgcc 2400

gctggcgtgg ggagcacggg ctccattgtc tcctctgcgg cgcaacagca gcagcagcag 2460

caactaatta gcatcaaacg agagcccgaa gacttgcgca aggatcccaa gaatggcaac 2520

attgccggtg cagcaacagc aaatggaccc ggttcggtca taacccaaaa gtcctttgat 2580

tatacggaat tgtgccagcc gggcacgctg atcgatgcca atggcagcat acccgtcagc 2640

gtgaacagca tccagcagag aacggcggtc catggcagcc agaacagtcc caccacatcg 2700

ctggtggaca ccagcaccaa tggatccacg cgatcgcggc cctggcacga ctttggacgt 2760

cagaatgatg ccgacaaaat acaaatacca aaaatcttca caaacgtggg cttccgatat 2820

cacctggaga gccccatcag ttcatcgcag aggcgcgagg acgatcgcat cacctacatc 2880

aacaagggtc aattctatgg aataacgctg gagtatgtgc acgatgcgga aaagcccatt 2940

aagaacacca ccgtcaagag tgtgatcatg ctaatgttcc gcgaggagaa gagtcccgag 3000

gatgagatca aggcctggca attctggcac agtcgtcagc attccgtgaa gcagagaatc 3060

ttggatgcag atacgaagaa ctcggttggc ctcgttggct gcatcgagga agtgtcgcac 3120

aatgccatcg ccgtctactg gaatccgctg gagagctccg ccaagatcaa cattgcggtt 3180

cagtgcttga gcacggattt cagcagtcaa aagggaggcc tgccgctgca cgtacaaatc 3240

gacacatttg aggaccccag agatacggcg gtcttccacc gcggctactg tcagataaag 3300

gtcttctgcg ataagggcgc cgaacgaaag acgcgcgatg aagagcggcg ggccgccaaa 3360

cgaaagatga cagccacggg cagaaagaag ctggacgagc tttaccatcc ggtaacggat 3420

cggtccgagt tctatggcat gcaggacttc gccaagccgc cggtgctatt ctcgcccgcc 3480

gaggacatgg agaagagctt ctacggccat gagactgact cgccggacct gaagggggcc 3540

tcaccgttcc tgctccacgg ccagaaggtg gccacgccga cgctcaagtt ccacaaccat 3600

tttccgcccg acatgcagac cgataagaag gatcacatac tggaccagaa catgttgacc 3660

agcacacccc tgaccgactt tggtccgccg atgaagcgcg gcaggatgac gccgccgacc 3720

tcggaacgcg tgatgctgta cgtgcggcag gagaacgagg aggtgtatac accgttgcac 3780

gtggtgccgc ccaccacgat cggcctgcta aatgcgattg aaaacaaata caaaatctca 3840

acaacgagca taaataacat ttatcgcaca aacaagaagg ggattactgc gaaaattgac 3900

gatgacatga tatcgttcta ctgcaacgag gacatctttc tgctggaggt gcaacagatc 3960

gaggacgacc tgtacgatgt gacgctcacg gagctgccca atcagtagcg ctggcagtac 4020

gggtagcacc cgctaaccgc actcaaaaaa aaaagcaaac aaacacacaa attacggaca 4080

caacaagttg tttcaataag ccattttcca tagagcctaa gtctaaatat cgtagttata 4140

ataatgggat ccgcaacaaa tcgagttgca acgaatgtta agaacgctaa cacaatacgc 4200

atgtaaaatg atactttaaa attgatttag ttattttagc aacaatgaga ttatctaaaa 4260

ttgtttgatc aaattttaca ttctcgctat gtctatagat aattctaagc ccgtaagccc 4320

ataagcgtaa tcgtaatcgt aatcgtaccg tgtatttatg ctcatatata aacaactata 4380

tatatatata tatatatata tatgtgcgga gtgcaacagt gtctgtccag taggagataa 4440

gtctcgtttc cgctcccctg cttatgctat gaccttaggt ccagggcaag tatgagttac 4500

cgaatctatc tattaggtgc atctaacgaa aggaatcatt agctctgcac gaactctagc 4560

cgtagcctat tgtaatccat ttgtatgttt ggcttaagcg ttttacttgt tgaatataaa 4620

gtgtaaaatt atttttgaaa aaaaaaaacc cacacaaaac acaaatcgtt tgttctatat 4680

ttctgtttca aaactaactc gttacccaca atcccctctg ttatgtataa ttaggatctc 4740
tgtacac 4747



<210> 35 

<211> 1030 

<212> PRT 

<213> Drosophila 

<400> 35 

Met Ser Thr Ser Thr Ala Thr Thr Ser Val Ile Thr Ser Asn Glu Leu
1 5 10 15

Ser Leu Ser Gly His Ala His Gly His Gly His Ala His Gln Leu His
20 25 30

Gln His Thr His Ser Arg Leu Gly Val Gly Val Gly Val Gly Ile Leu
35 40 45

Ser Asp Ala Ser Leu Ser Pro Ile Gln Gln Gly Ser Gly Gly His Ser
50 55 60

Gly Gly Gly Asn Thr Asn Ser Ser Pro Leu Ala Pro Asn Gly Val Pro
65 70 75 80

Leu Leu Thr Thr Met His Arg Ser Pro Asp Ser Pro Gln Pro Glu Leu
85 90 95

Ala Thr Met Thr Asn Val Asn Val Leu Asp Leu His Thr Asp Asn Ser
100 105 110

Lys Leu Tyr Asp Lys Glu Ala Val Phe Ile Tyr Glu Thr Pro Lys Val
115 120 125

Val Met Pro Ala Asp Gly Gly Gly Gly Asn Asn Ser Asp Glu Gly His
130 135 140

Ala Ile Asp Ala Arg Ile Ala Ala Gln Met Gly Asn Gln Ala Gln Gln
145 150 155 160

Gln Gln Gln Gln Gln Gln Gln Thr Glu His Gln Pro Leu Ala Lys Ile
165 170 175

Glu Phe Asp Glu Asn Gln Ile Ile Arg Val Val Gly Pro Asn Gly Glu
180 185 190

Gln Gln Gln Ile Ile Ser Arg Glu Ile Ile Asn Gly Glu His His Ile
195 200 205

Leu Ser Arg Asn Glu Ala Gly Glu His Ile Leu Thr Arg Ile Val Ser
210 215 220

Asp Pro Ser Lys Leu Met Pro Asn Asp Asn Ala Val Ala Thr Ala Met
225 230 235 240

Tyr Asn Gln Ala Gln Lys Met Asn Asn Asp His Gly Gln Ala Val Tyr
245 250 255

Gln Thr Ser Pro Leu Pro Leu Asp Ala Ser Val Leu His Tyr Ser Gly
260 265 270

Gly Asn Asp Ser Asn Val Ile Lys Thr Glu Ala Asp Ile Tyr Glu Asp
275 280 285

His Lys Lys His Ala Ala Ala Ala Ala Ala Ala Ala Gly Gly Gly Ser
290 295 300

Ile Ile Tyr Thr Thr Ser Asp Pro Asn Gly Val Asn Val Lys Gln Leu
305 310 315 320

Pro His Leu Thr Val Pro Gln Lys Leu Asp Pro Asp Leu Tyr Gln Ala
325 330 335

Asp Lys His Ile Asp Leu Ile Tyr Asn Asp Gly Ser Lys Thr Val Ile
340 345 350

Tyr Ser Thr Thr Asp Gln Lys Ser Leu Glu Ile Tyr Ser Gly Gly Asp
355 360 365

Ile Gly Ser Leu Val Ser Asp Gly Gln Val Val Val Gln Ala Gly Leu
370 375 380

Pro Tyr Ala Thr Thr Thr Gly Ala Gly Gly Gln Pro Val Tyr Ile Val
385 390 395 400

Ala Asp Gly Ala Leu Pro Ala Gly Val Glu Glu His Leu Gln Ser Gly
405 410 415

Lys Leu Asn Gly Gln Thr Thr Pro Ile Asp Val Ser Gly Leu Ser Gln
420 425 430

Asn Glu Ile Gln Gly Phe Leu Leu Gly Ser His Pro Ser Ser Ser Ala
435 440 445

Thr Val Ser Thr Thr Gly Val Val Ser Thr Thr Thr Ile Ser His His
450 455 460

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
465 470 475 480

Gln His Gln Gln Gln Gln Gln His Pro Gly Asp Ile Val Ser Ala Ala
485 490 495

Gly Val Gly Ser Thr Gly Ser Ile Val Ser Ser Ala Ala Gln Gln Gln
500 505 510

Gln Gln Gln Gln Leu Ile Ser Ile Lys Arg Glu Pro Glu Asp Leu Arg
515 520 525

Lys Asp Pro Lys Asn Gly Asn Ile Ala Gly Ala Ala Thr Ala Asn Gly
530 535 540

Pro Gly Ser Val Ile Thr Gln Lys Ser Phe Asp Tyr Thr Glu Leu Cys
545 550 555 560

Gln Pro Gly Thr Leu Ile Asp Ala Asn Gly Ser Ile Pro Val Ser Val
565 570 575

Asn Ser Ile Gln Gln Arg Thr Ala Val His Gly Ser Gln Asn Ser Pro
580 585 590

Thr Thr Ser Leu Val Asp Thr Ser Thr Asn Gly Ser Thr Arg Ser Arg
595 600 605

Pro Trp His Asp Phe Gly Arg Gln Asn Asp Ala Asp Lys Ile Gln Ile
610 615 620

Pro Lys Ile Phe Thr Asn Val Gly Phe Arg Tyr His Leu Glu Ser Pro
625 630 635 640

Ile Ser Ser Ser Gln Arg Arg Glu Asp Asp Arg Ile Thr Tyr Ile Asn
645 650 655

Lys Gly Gln Phe Tyr Gly Ile Thr Leu Glu Tyr Val His Asp Ala Glu
660 665 670

Lys Pro Ile Lys Asn Thr Thr Val Lys Ser Val Ile Met Leu Met Phe
675 680 685

Arg Glu Glu Lys Ser Pro Glu Asp Glu Ile Lys Ala Trp Gln Phe Trp
690 695 700

His Ser Arg Gln His Ser Val Lys Gln Arg Ile Leu Asp Ala Asp Thr
705 710 715 720

Lys Asn Ser Val Gly Leu Val Gly Cys Ile Glu Glu Val Ser His Asn
725 730 735

Ala Ile Ala Val Tyr Trp Asn Pro Leu Glu Ser Ser Ala Lys Ile Asn
740 745 750

Ile Ala Val Gln Cys Leu Ser Thr Asp Phe Ser Ser Gln Lys Gly Gly
755 760 765

Leu Pro Leu His Val Gln Ile Asp Thr Phe Glu Asp Pro Arg Asp Thr
770 775 780

Ala Val Phe His Arg Gly Tyr Cys Gln Ile Lys Val Phe Cys Asp Lys
785 790 795 800

Gly Ala Glu Arg Lys Thr Arg Asp Glu Glu Arg Arg Ala Ala Lys Arg
805 810 815

Lys Met Thr Ala Thr Gly Arg Lys Lys Leu Asp Glu Leu Tyr His Pro
820 825 830

Val Thr Asp Arg Ser Glu Phe Tyr Gly Met Gln Asp Phe Ala Lys Pro
835 840 845

Pro Val Leu Phe Ser Pro Ala Glu Asp Met Glu Lys Ser Phe Tyr Gly
850 855 860

His Glu Thr Asp Ser Pro Asp Leu Lys Gly Ala Ser Pro Phe Leu Leu
865 870 875 880

His Gly Gln Lys Val Ala Thr Pro Thr Leu Lys Phe His Asn His Phe
885 890 895

Pro Pro Asp Met Gln Thr Asp Lys Lys Asp His Ile Leu Asp Gln Asn
900 905 910

Met Leu Thr Ser Thr Pro Leu Thr Asp Phe Gly Pro Pro Met Lys Arg
915 920 925

Gly Arg Met Thr Pro Pro Thr Ser Glu Arg Val Met Leu Tyr Val Arg
930 935 940

Gln Glu Asn Glu Glu Val Tyr Thr Pro Leu His Val Val Pro Pro Thr
945 950 955 960

Thr Ile Gly Leu Leu Asn Ala Ile Glu Asn Lys Tyr Lys Ile Ser Thr
965 970 975

Thr Ser Ile Asn Asn Ile Tyr Arg Thr Asn Lys Lys Gly Ile Thr Ala
980 985 990

Lys Ile Asp Asp Asp Met Ile Ser Phe Tyr Cys Asn Glu Asp Ile Phe
995 1000 1005

Leu Leu Glu Val Gln Gln Ile Glu Asp Asp Leu Tyr Asp Val Thr
1010 1015 1020

Leu Thr Glu Leu Pro Asn Gln
1025 1030

<210> 36 

<211> 5650 

<212> DNA 

<213> Drosophila 



<400> 36 



aaaaatagaa aaaacaacaa caaattggct tgaaaacgca aatgccaggc gcaacgcccc 60

cgaaccgacc cgccccctca acttttgcgc cctccagtag caatagcagc aatatgagca 120

gcagcaacat caaatgttag gccaaaatgc acaaaccgcc agcaacaaag gcagcaccaa 180

gcgaacgaaa caacaacagc tccacatacc acaaagagtg gcacattaga agcggccaaa 240

agcagccagc cgagagcatt gtgtaagcca aaggcccaga gagccaggct aaaagccccc 300

agacgcacaa caacaacaac aacaactaaa acagcacaaa gagtggcgaa aggtgcaccc 360

accagcaaaa cagcaacaac ggagcaacca acaacagcag cagcagcagc agcagccaca 420

tttcagttac agctccagac tcccaggttg cagactccca aagcaaacag actccagtcc 480

acgatccagc tccagttcca ccgatccgat ccactgctcc agcgtgctcg agtgccatag 540

atcctcacca agtgccaaaa tccgcatcct gatcccaaga gctcaaggca ccccggccca 600

aaattgagct gagaacgaaa cgaaggaagt tccttagtgc catagaaagc agttaatgaa 660

acaacgacta agacgaagat cgaccatcca gaaccggagg gagctaattg cgaacgaaag 720

aaaccacaaa gtgccttcca tcaatccgtt gataagtgat atttattatg tttatacttg 780

ccagcagccg aggcagcaac agcaatagca acaaccatag gggatcacgg catcgatgat 840

cagtccacga ccaagtccta gtgcaatccg gaatccagtt caaattagtt caataagccg 900

tatctaccac gtataatgtc cacatccacc gccacaacga gcgttatcac gtccaacgag 960

ctctcgctgt ccggccacgc ccacggtcac ggtcacgccc accagttgca ccagcacacc 1020

cacagccgcc taggagttgg cgttggtgtt ggcatcctta gcgacgcatc cctatcgccc 1080

atccaacaag gcagtggcgg ccacagcggc ggaggtaaca caaacagttc accactggcg 1140

cccaacggag tgccacttct cacaacaatg caccgatcac cggactcacc gcagccagaa 1200

ttggccacca tgacgaacgt caacgtgctg gatctgcaca cggataactc caagctgtac 1260

gacaaggagg ctgtatttat atacgaaacg cccaaggtgg tgatgccagc ggatggcggg 1320

ggtggcaata attccgatga aggtcatgcc atcgatgcgc ggattgcggc ccaaatgggc 1380

aaccaagccc agcaacagca gcagcagcaa cagcagacgg aacaccagcc gctggccaag 1440

atcgagttcg atgagaacca gataatccgg gtggtgggac caaatggcga gcaacagcaa 1500

atcatctcgc gggagatcat caatggggag catcatatcc tgtcgcgaaa cgaggctggt 1560

gagcacattc tcacacggat cgtcagtgat ccctccaagt tgatgcccaa tgacaatgca 1620

gtggccacgg ccatgtacaa ccaggcccaa aagatgaaca atgatcacgg gcaggcggta 1680

tatcagacat caccattgcc gctagacgcg tctgtattgc attatagtgg cggcaatgat 1740

tcgaatgtaa ttaagacgga ggccgatatc tacgaggatc acaagaaaca tgcggctgca 1800

gcagcagctg ctgccggcgg aggatccatc atatacacca catccgatcc gaacggagtg 1860

aatgtgaaac aactgcccca tttgacggta ccccaaaaac ttgatcccga cctctatcaa 1920

gccgataagc atatagattt gatctacaac gatggcagca agacggtgat ttactccact 1980

acggatcaga agagtttgga aatatactcg ggcggcgaca tcggcagcct ggtgtccgac 2040

ggccaagtgg tggtccaggc gggactgccg tatgccacca ccaccggagc cggcggccag 2100

cccgtctata tcgtggccga cggtgccttg ccagcgggag tcgaggagca tctgcagagt 2160

ggaaagctca atggccagac cacacctatc gatgtctctg gcctatcgca aaatgagatt 2220

caaggctttt tgctcggctc acacccctcg tcatcggcga cggtaagcac aaccggcgtt 2280

gtctccacga caacgatctc gcatcaccag caacagcagc agcagcagca acagcaacag 2340

cagcagcagc agcagcaaca ccagcagcag cagcaacatc ccggcgacat tgttagtgcc 2400

gctggcgtgg ggagcacggg ctccattgtc tcctctgcgg cgcaacagca gcagcagcag 2460

caactaatta gcatcaaacg agagcccgaa gacttgcgca aggatcccaa gaatggcaac 2520

attgccggtg cagcaacagc aaatggaccc ggttcggtca taacccaaaa gatcttgcac 2580

gtggatgcac caacggcaag tgaagctgat aggcccagca cacccagcag cagcatcaac 2640

agcactgaaa acactgaatc ggactcacag tcagtatcag gatcagaatc aggatcgccg 2700

ggagccagga ccacagccac actagagatg tatgcaacca cgggcggcac acagatctat 2760

ctacagacct cacatcccag cacggcgagc ggagcgggcg gcggcgccgg acccgctgga 2820

gccgccggcg gcggcggtgt gtccatgcag gcgcaaagtc ccagtccggg tccctatatc 2880

acggccaatg actatggcat gtacacggcc agtcgcctgc cacccggtcc cccgcccacc 2940

agcaccacca cgtttatagc ggagccctcc tactatcggg aatactttgc accggatggc 3000

caaggtggct atgtgccggc cagcacgagg tctttgtatg gcgacgtgga cgtatccgta 3060

tctcagcccg gcggagtggt cacctatgag ggccgctttg ccggcagcgt tcccccgccc 3120

gccaccacca ccgtgctaac cagcgtgcat caccaccagc aacagcagca gcaacaacag 3180

cagcatcaac agcagcagca gcagcaacag caccaccagc agcaacagca ccattcgcag 3240 

gatggcaaga gcaatggcgg agcaacgcca ctctatgcca aagccattac ggcggcgggt 3300 

ctaacggtgg atttgccaag tccggattcg ggcattggta cggatgccat tacaccgcgg 3360 

gatcagacaa atatccaaca gtcctttgat tatacggaat tgtgccagcc gggcacgctg 3420 

atcgatgcca atggcagcat acccgtcagc gtgaacagca tccagcagag aacggcggtc 3480 

catggcagcc agaacagtcc caccacatcg ctggtggaca ccagcaccaa tggatccacg 3540 

cgatcgcggc cctggcacga ctttggacgt cagaatgatg ccgacaaaat acaaatacca 3600

aaaatcttca caaacgtggg cttccgatat cacctggaga gccccatcag ttcatcgcag 3660 

aggcgcgagg acgatcgcat cacctacatc aacaagggtc aattctatgg aataacgctg 3720

gagtatgtgc acgatgcgga aaagcccatt aagaacacca ccgtcaagag tgtgatcatg 3780 

ctaatgttcc gcgaggagaa gagtcccgag gatgagatca aggcctggca attctggcac 3840 

agtcgtcagc attccgtgaa gcagagaatc ttggatgcag atacgaagaa ctcggttggc 3900 

ctcgttggct gcatcgagga agtgtcgcac aatgccatcg ccgtctactg gaatccgctg 3960 

gagagctccg ccaagatcaa cattgcggtt cagtgcttga gcacggattt cagcagtcaa 4020

aagggaggcc tgccgctgca cgtacaaatc gacacatttg aggaccccag agatacggcg 4080

gtcttccacc gcggctactg tcagataaag gtcttctgcg ataagggcgc cgaacgaaag 4140 

acgcgcgatg aagagcggcg ggccgccaaa cgaaagatga cagccacggg cagaaagaag 4200

ctggacgagc tttaccatcc ggtaacggat cggtccgagt tctatggcat gcaggacttc 4260 

gccaagccgc cggtgctatt ctcgcccgcc gaggacatgg agaaggtagg tcagctgggc 4320 

attggcgctg ccaccggcat gacattcaac cccctgagca acggcaactc caactccaac 4380

tcgcactcgt ccttgcagag cttctacggc catgagactg actcgccgga cctgaagggg 4440 
gcctcaccgt tcctgctcca cggccagaag gtggccacgc cgacgctcaa gttccacaac 4500 
cattttccgc ccgacatgca gaccgataag aaggatcaca tactggacca gaacatgttg 4560 
accagcacac ccctgaccga ctttggtccg ccgatgaagc gcggcaggat gacgccgccg 4620 
acctcggaac gcgtgatgct gtacgtgcgg caggagaacg aggaggtgta tacaccgttg 4680
cacgtggtgc cgcccaccac gatcggcctg ctaaatgcga ttgaaaacaa atacaaaatc 4740 
tcaacaacga gcataaataa catttatcgc acaaacaaga aggggattac tgcgaaaatt 4800
gacgatgaca tgatatcgtt ctactgcaac gaggacatct ttctgctgga ggtgcaacag 4860 
atcgaggacg acctgtacga tgtgacgctc acggagctgc ccaatcagta gcgctggcag 4920
tacgggtagc acccgctaac cgcactcaaa aaaaaaagca aacaaacaca caaattacgg 4980
acacaacaag ttgtttcaat aagccatttt ccatagagcc taagtctaaa tatcgtagtt 5040
ataataatgg gatccgcaac aaatcgagtt gcaacgaatg ttaagaacgc taacacaata 5100
cgcatgtaaa atgatacttt aaaattgatt tagttatttt agcaacaatg agattatcta 5160
aaattgtttg atcaaatttt acattctcgc tatgtctata gataattcta agcccgtaag 5220
cccataagcg taatcgtaat cgtaatcgta ccgtgtattt atgctcatat ataaacaact 5280
atatatatat atatatatat atatatgtgc ggagtgcaac agtgtctgtc cagtaggaga 5340
taagtctcgt ttccgctccc ctgcttatgc tatgacctta ggtccagggc aagtatgagt 5400
taccgaatct atctattagg tgcatctaac gaaaggaatc attagctctg cacgaactct 5460
agccgtagcc tattgtaatc catttgtatg tttggcttaa gcgttttact tgttgaatat 5520
aaagtgtaaa attatttttg aaaaaaaaaa acccacacaa aacacaaatc gtttgttcta 5580
tatttctgtt tcaaaactaa ctcgttaccc acaatcccct ctgttatgta taattaggat 5640
ctctgtacac 5650



<210> 37 

<211> 1331

<212> PRT 

<213> Drosophila 

<400> 37 

Met Ser Thr Ser Thr Ala Thr Thr Ser Val Ile Thr Ser Asn Glu Leu
1 5 10 15

Ser Leu Ser Gly His Ala His Gly His Gly His Ala His Gln Leu His
20 25 30

Gln His Thr His Ser Arg Leu Gly Val Gly Val Gly Val Gly Ile Leu
35 40 45

Ser Asp Ala Ser Leu Ser Pro Ile Gln Gln Gly Ser Gly Gly His Ser
50 55 60

Gly Gly Gly Asn Thr Asn Ser Ser Pro Leu Ala Pro Asn Gly Val Pro
65 70 75 80

Leu Leu Thr Thr Met His Arg Ser Pro Asp Ser Pro Gln Pro Glu Leu
85 90 95

Ala Thr Met Thr Asn Val Asn Val Leu Asp Leu His Thr Asp Asn Ser
100 105 110

Lys Leu Tyr Asp Lys Glu Ala Val Phe Ile Tyr Glu Thr Pro Lys Val
115 120 125

Val Met Pro Ala Asp Gly Gly Gly Gly Asn Asn Ser Asp Glu Gly His
130 135 140

Ala Ile Asp Ala Arg Ile Ala Ala Gln Met Gly Asn Gln Ala Gln Gln
145 150 155 160

Gln Gln Gln Gln Gln Gln Gln Thr Glu His Gln Pro Leu Ala Lys Ile
165 170 175

Glu Phe Asp Glu Asn Gln Ile Ile Arg Val Val Gly Pro Asn Gly Glu
180 185 190

Gln Gln Gln Ile Ile Ser Arg Glu Ile Ile Asn Gly Glu His His Ile
195 200 205

Leu Ser Arg Asn Glu Ala Gly Glu His Ile Leu Thr Arg Ile Val Ser
210 215 220

Asp Pro Ser Lys Leu Met Pro Asn Asp Asn Ala Val Ala Thr Ala Met
225 230 235 240

Tyr Asn Gln Ala Gln Lys Met Asn Asn Asp His Gly Gln Ala Val Tyr
245 250 255

Gln Thr Ser Pro Leu Pro Leu Asp Ala Ser Val Leu His Tyr Ser Gly
260 265 270

Gly Asn Asp Ser Asn Val Ile Lys Thr Glu Ala Asp Ile Tyr Glu Asp
275 280 285

His Lys Lys His Ala Ala Ala Ala Ala Ala Ala Ala Gly Gly Gly Ser
290 295 300



Ile Ile Tyr Thr Thr Ser Asp Pro Asn Gly Val Asn Val Lys Gln Leu
305 310 315 320

Pro His Leu Thr Val Pro Gln Lys Leu Asp Pro Asp Leu Tyr Gln Ala
325 330 335

Asp Lys His Ile Asp Leu Ile Tyr Asn Asp Gly Ser Lys Thr Val Ile
340 345 350

Tyr Ser Thr Thr Asp Gln Lys Ser Leu Glu Ile Tyr Ser Gly Gly Asp
355 360 365

Ile Gly Ser Leu Val Ser Asp Gly Gln Val Val Val Gln Ala Gly Leu
370 375 380

Pro Tyr Ala Thr Thr Thr Gly Ala Gly Gly Gln Pro Val Tyr Ile Val
385 390 395 400

Ala Asp Gly Ala Leu Pro Ala Gly Val Glu Glu His Leu Gln Ser Gly
405 410 415

Lys Leu Asn Gly Gln Thr Thr Pro Ile Asp Val Ser Gly Leu Ser Gln
420 425 430

Asn Glu Ile Gln Gly Phe Leu Leu Gly Ser His Pro Ser Ser Ser Ala
435 440 445

Thr Val Ser Thr Thr Gly Val Val Ser Thr Thr Thr Ile Ser His His
450 455 460

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
465 470 475 480

Gln His Gln Gln Gln Gln Gln His Pro Gly Asp Ile Val Ser Ala Ala
485 490 495

Gly Val Gly Ser Thr Gly Ser Ile Val Ser Ser Ala Ala Gln Gln Gln
500 505 510

Gln Gln Gln Gln Leu Ile Ser Ile Lys Arg Glu Pro Glu Asp Leu Arg
515 520 525



Lys Asp Pro Lys Asn Gly Asn Ile Ala Gly Ala Ala Thr Ala Asn Gly
530 535 540

Pro Gly Ser Val Ile Thr Gln Lys Ile Leu His Val Asp Ala Pro Thr
545 550 555 560

Ala Ser Glu Ala Asp Arg Pro Ser Thr Pro Ser Ser Ser Ile Asn Ser
565 570 575

Thr Glu Asn Thr Glu Ser Asp Ser Gln Ser Val Ser Gly Ser Glu Ser
580 585 590

Gly Ser Pro Gly Ala Arg Thr Thr Ala Thr Leu Glu Met Tyr Ala Thr
595 600 605

Thr Gly Gly Thr Gln Ile Tyr Leu Gln Thr Ser His Pro Ser Thr Ala
610 615 620

Ser Gly Ala Gly Gly Gly Ala Gly Pro Ala Gly Ala Ala Gly Gly Gly
625 630 635 640

Gly Val Ser Met Gln Ala Gln Ser Pro Ser Pro Gly Pro Tyr Ile Thr
645 650 655

Ala Asn Asp Tyr Gly Met Tyr Thr Ala Ser Arg Leu Pro Pro Gly Pro
660 665 670

Pro Pro Thr Ser Thr Thr Thr Phe Ile Ala Glu Pro Ser Tyr Tyr Arg
675 680 685

Glu Tyr Phe Ala Pro Asp Gly Gln Gly Gly Tyr Val Pro Ala Ser Thr
690 695 700

Arg Ser Leu Tyr Gly Asp Val Asp Val Ser Val Ser Gln Pro Gly Gly
705 710 715 720

Val Val Thr Tyr Glu Gly Arg Phe Ala Gly Ser Val Pro Pro Pro Ala
725 730 735

Thr Thr Thr Val Leu Thr Ser Val His His His Gln Gln Gln Gln Gln
740 745 750



Gln Gln Gln Gln His Gln Gln Gln Gln Gln Gln Gln Gln His His Gln
755 760 765

Gln Gln Gln His His Ser Gln Asp Gly Lys Ser Asn Gly Gly Ala Thr
770 775 780

Pro Leu Tyr Ala Lys Ala Ile Thr Ala Ala Gly Leu Thr Val Asp Leu
785 790 795 800

Pro Ser Pro Asp Ser Gly Ile Gly Thr Asp Ala Ile Thr Pro Arg Asp
805 810 815

Gln Thr Asn Ile Gln Gln Ser Phe Asp Tyr Thr Glu Leu Cys Gln Pro
820 825 830

Gly Thr Leu Ile Asp Ala Asn Gly Ser Ile Pro Val Ser Val Asn Ser
835 840 845

Ile Gln Gln Arg Thr Ala Val His Gly Ser Gln Asn Ser Pro Thr Thr
850 855 860

Ser Leu Val Asp Thr Ser Thr Asn Gly Ser Thr Arg Ser Arg Pro Trp
865 870 875 880

His Asp Phe Gly Arg Gln Asn Asp Ala Asp Lys Ile Gln Ile Pro Lys
885 890 895

Ile Phe Thr Asn Val Gly Phe Arg Tyr His Leu Glu Ser Pro Ile Ser
900 905 910

Ser Ser Gln Arg Arg Glu Asp Asp Arg Ile Thr Tyr Ile Asn Lys Gly
915 920 925

Gln Phe Tyr Gly Ile Thr Leu Glu Tyr Val His Asp Ala Glu Lys Pro
930 935 940

Ile Lys Asn Thr Thr Val Lys Ser Val Ile Met Leu Met Phe Arg Glu
945 950 955 960

Glu Lys Ser Pro Glu Asp Glu Ile Lys Ala Trp Gln Phe Trp His Ser
965 970 975

Arg Gln His Ser Val Lys Gln Arg Ile Leu Asp Ala Asp Thr Lys Asn
980 985 990

Ser Val Gly Leu Val Gly Cys Ile Glu Glu Val Ser His Asn Ala Ile
995 1000 1005



Ala Val Tyr Trp Asn Pro Leu Glu Ser Ser Ala Lys Ile Asn Ile
1010 1015 1020

Ala Val Gln Cys Leu Ser Thr Asp Phe Ser Ser Gln Lys Gly Gly
1025 1030 1035

Leu Pro Leu His Val Gln Ile Asp Thr Phe Glu Asp Pro Arg Asp
1040 1045 1050

Thr Ala Val Phe His Arg Gly Tyr Cys Gln Ile Lys Val Phe Cys
1055 1060 1065

Asp Lys Gly Ala Glu Arg Lys Thr Arg Asp Glu Glu Arg Arg Ala
1070 1075 1080

Ala Lys Arg Lys Met Thr Ala Thr Gly Arg Lys Lys Leu Asp Glu
1085 1090 1095

Leu Tyr His Pro Val Thr Asp Arg Ser Glu Phe Tyr Gly Met Gln
1100 1105 1110

Asp Phe Ala Lys Pro Pro Val Leu Phe Ser Pro Ala Glu Asp Met
1115 1120 1125

Glu Lys Val Gly Gln Leu Gly Ile Gly Ala Ala Thr Gly Met Thr
1130 1135 1140

Phe Asn Pro Leu Ser Asn Gly Asn Ser Asn Ser Asn Ser His Ser
1145 1150 1155

Ser Leu Gln Ser Phe Tyr Gly His Glu Thr Asp Ser Pro Asp Leu
1160 1165 1170

Lys Gly Ala Ser Pro Phe Leu Leu His Gly Gln Lys Val Ala Thr
1175 1180 1185

Pro Thr Leu Lys Phe His Asn His Phe Pro Pro Asp Met Gln Thr
1190 1195 1200

Asp Lys Lys Asp His Ile Leu Asp Gln Asn Met Leu Thr Ser Thr
1205 1210 1215

Pro Leu Thr Asp Phe Gly Pro Pro Met Lys Arg Gly Arg Met Thr
1220 1225 1230

Pro Pro Thr Ser Glu Arg Val Met Leu Tyr Val Arg Gln Glu Asn
1235 1240 1245

Glu Glu Val Tyr Thr Pro Leu His Val Val Pro Pro Thr Thr Ile
1250 1255 1260

Gly Leu Leu Asn Ala Ile Glu Asn Lys Tyr Lys Ile Ser Thr Thr
1265 1270 1275

Ser Ile Asn Asn Ile Tyr Arg Thr Asn Lys Lys Gly Ile Thr Ala
1280 1285 1290

Lys Ile Asp Asp Asp Met Ile Ser Phe Tyr Cys Asn Glu Asp Ile
1295 1300 1305

Phe Leu Leu Glu Val Gln Gln Ile Glu Asp Asp Leu Tyr Asp Val
1310 1315 1320

Thr Leu Thr Glu Leu Pro Asn Gln
1325 1330



<210> 38 

<211> 5557 

<212> DNA 

<213> Drosophila 

<400> 38 

aaaaatagaa aaaacaacaa caaattggct tgaaaacgca aatgccaggc gcaacgcccc 60 

cgaaccgacc cgccccctca acttttgcgc cctccagtag caatagcagc aatatgagca 120

gcagcaacat caaatgttag gccaaaatgc acaaaccgcc agcaacaaag gcagcaccaa 180

gcgaacgaaa caacaacagc tccacatacc acaaagagtg gcacattaga agcggccaaa 240

agcagccagc cgagagcatt gtgtaagcca aaggcccaga gagccaggct aaaagccccc 300

agacgcacaa caacaacaac aacaactaaa acagcacaaa gagtggcgaa aggtgcaccc 360

accagcaaaa cagcaacaac ggagcaacca acaacagcag cagcagcagc agcagccaca 420

tttcagttac agctccagac tcccaggttg cagactccca aagcaaacag actccagtcc 480

acgatccagc tccagttcca ccgatccgat ccactgctcc agcgtgctcg agtgccatag 540

atcctcacca agtgccaaaa tccgcatcct gatcccaaga gctcaaggca ccccggccca 600

aaattgagct gagaacgaaa cgaaggaagt tccttagtgc catagaaagc agttaatgaa 660

acaacgacta agacgaagat cgaccatcca gaaccggagg gagctaattg cgaacgaaag 720

aaaccacaaa gtgccttcca tcaatccgtt gataagtgat atttattatg tttatacttg 780

ccagcagccg aggcagcaac agcaatagca acaaccatag gggatcacgg catcgatgat 840

cagtccacga ccaagtccta gtgcaatccg gaatccagtt caaattagtt caataagccg 900

tatctaccac gtataatgtc cacatccacc gccacaacga gcgttatcac gtccaacgag 960

ctctcgctgt ccggccacgc ccacggtcac ggtcacgccc accagttgca ccagcacacc 1020

cacagccgcc taggagttgg cgttggtgtt ggcatcctta gcgacgcatc cctatcgccc 1080

atccaacaag gcagtggcgg ccacagcggc ggaggtaaca caaacagttc accactggcg 1140

cccaacggag tgccacttct cacaacaatg caccgatcac cggactcacc gcagccagaa 1200

ttggccacca tgacgaacgt caacgtgctg gatctgcaca cggataactc caagctgtac 1260

gacaaggagg ctgtatttat atacgaaacg cccaaggtgg tgatgccagc ggatggcggg 1320

ggtggcaata attccgatga aggtcatgcc atcgatgcgc ggattgcggc ccaaatgggc 1380

aaccaagccc agcaacagca gcagcagcaa cagcagacgg aacaccagcc gctggccaag 1440

atcgagttcg atgagaacca gataatccgg gtggtgggac caaatggcga gcaacagcaa 1500

atcatctcgc gggagatcat caatggggag catcatatcc tgtcgcgaaa cgaggctggt 1560

gagcacattc tcacacggat cgtcagtgat ccctccaagt tgatgcccaa tgacaatgca 1620

gtggccacgg ccatgtacaa ccaggcccaa aagatgaaca atgatcacgg gcaggcggta 1680

tatcagacat caccattgcc gctagacgcg tctgtattgc attatagtgg cggcaatgat 1740

tcgaatgtaa ttaagacgga ggccgatatc tacgaggatc acaagaaaca tgcggctgca 1800

gcagcagctg ctgccggcgg aggatccatc atatacacca catccgatcc gaacggagtg 1860

aatgtgaaac aactgcccca tttgacggta ccccaaaaac ttgatcccga cctctatcaa 1920

gccgataagc atatagattt gatctacaac gatggcagca agacggtgat ttactccact 1980

acggatcaga agagtttgga aatatactcg ggcggcgaca tcggcagcct ggtgtccgac 2040

ggccaagtgg tggtccaggc gggactgccg tatgccacca ccaccggagc cggcggccag 2100

cccgtctata tcgtggccga cggtgccttg ccagcgggag tcgaggagca tctgcagagt 2160

ggaaagctca atggccagac cacacctatc gatgtctctg gcctatcgca aaatgagatt 2220

caaggctttt tgctcggctc acacccctcg tcatcggcga cggtaagcac aaccggcgtt 2280


gtctccacga caacgatctc gcatcaccag caacagcagc agcagcagca acagcaacag 2340

cagcagcagc agcagcaaca ccagcagcag cagcaacatc ccggcgacat tgttagtgcc 2400

gctggcgtag ggagcacggg ctccattgtc tcctctgcgg cgcaacagca acaacaacaa 2460

caactaatta gcatcaaacg agagcccgaa gacttacgca aggatcccaa gaatggcaac 2520

attgccggtg cagcaacagc aaatggaccc ggttcggtca taacccaaaa gatcttgcac 2580

gtggatgcac caacggcaag tgaagctgat aggcccagca cacccagcag cagcatcaac 2640

agcactgaaa acactgaatc ggactcacag tcagtatcag gatcagaatc aggatcacca 2700

ggagccagga ccacagccac actagagatg tatgcaacca cgggcggcac acagatctat 2760

ctacagacct cacatcccag cacagcaagc ggagcgggcg gcggcgccgg acccgctgga 2820

gccgccggcg gcggcggtgt atccatgcaa gcgcaaagtc ccagtccagg tccctatatc 2880

acagccaatg actatggcat gtacacggcc agtcgcctgc cacccggtcc cccacccacc 2940

agcaccacca catttatagc ggagccctcc tactatcgag aatactttgc accagatggc 3000

caaggtggct atgtgccggc cagcacgagg tctttatatg gcgacgtaga cgtatccgta 3060

tctcaacccg gcggagtagt cacctatgag ggccgctttg ccggcagcgt tcccccaccc 3120

gccaccacca ccgtgctaac cagcgtgcat caccaccagc aacagcagca acaacaacaa 3180

caacatcaac agcagcagca gcagcaacag caccaccagc agcaacagca ccattcacaa 3240

gatggcaaga gcaatggcgg agcaacgcca ctctatgcca aagccattac z3S3 v *'Sd3^S3S3S3 u * 3300

ctaacagtag atttgccaag tccggattcg ggcattggta cggatgccat tacaccacga 3360

gatcagacaa atatccaaca gtcctttgat tatacggaat tgtgccagcc aggcacacta 3420

atcgatgcca atggcagcat acccgtcagc gtaaacagca tccagcagag aacagcagtc 3480

catggcagcc agaacagtcc caccacatcg ctggtggaca ccagcaccaa tggatccacg 3540

cgatcgcggc cctggcacga ctttggacgt cagaatgatg ccgacaaaat acaaatacca 3600

aaaatcttca caaacgtggg cttccgatat cacctggaga gccccatcag ttcatcgcag 3660

aggcgcgagg acgatcgcat cacctacatc aacaagggtc aattctatgg aataacgctg 3720

gagtatgtgc acgatgcgga aaagcccatt aagaacacca ccgtcaagag tgtgatcatg 3780

ctaatgttcc gcgaggagaa gagtcccgag gatgagatca aggcctggca attctggcac 3840


agtcgtcagc attccgtgaa gcagagaatc ttggatgcag atacgaagaa ctcggttggc 3900

ctcgttggct gcatcgagga agtgtcgcac aatgccatcg ccgtctactg gaatccgctg 3960

gagagctccg ccaagatcaa cattgcggtt cagtgcttga gcacggattt cagcagtcaa 4020

aagggaggcc tgccgctgca cgtacaaatc gacacatttg aggaccccag agatacggcg 4080

gtcttccacc gcggctactg tcagataaag gtcttctgcg ataagggcgc cgaacgaaag 4140

acgcgcgatg aagagcggcg ggccgccaaa cgaaagatga cagccacggg cagaaagaag 4200

ctggacgagc tttaccatcc ggtaacggat cggtccgagt tctatggcat gcaggacttc 4260

gccaagccgc cggtgctatt ctcgcccgcc gaggacatgg agaagagctt ctacggccat 4320

gagactgact cgccggacct gaagggggcc tcaccgttcc tgctccacgg ccagaaggtg 4380

gccacgccga cgctcaagtt ccacaaccat tttccgcccg acatgcagac cgataagaag 4440

gatcacatac tggaccagaa catgttgacc agcacacccc tgaccgactt tggtccgccg 4500

atgaagcgcg gcaggatgac gccgccgacc tcggaacgcg tgatgctgta cgtgcggcag 4560

gagaacgagg aggtgtatac accgttgcac gtggtgccgc ccaccacgat cggcctgcta 4620

aatgcgattg aaaacaaata caaaatctca acaacgagca taaataacat ttatcgcaca 4680

aacaagaagg ggattactgc gaaaattgac gatgacatga tatcgttcta ctgcaacgag 4740

gacatctttc tgctggaggt gcaacagatc gaggacgacc tgtacgatgt gacgctcacg 4800

gagctgccca atcagtagcg ctggcagtac gggtagcacc cgctaaccgc actcaaaaaa 4860

aaaagcaaac aaacacacaa attacggaca caacaagttg tttcaataag ccattttcca 4920

tagagcctaa gtctaaatat cgtagttata ataatgggat ccgcaacaaa tcgagttgca 4980

aogaatgtta agaacgctaa cacaatacgc atgtaaaatg atactttaaa attgatttag 5040

ttattttagc aacaatgaga ttatctaaaa ttgtttgatc aaattttaca ttctcgctat 5100

gtctatagat aattctaagc ccgtaagccc ataagcgtaa tcgtaatcgt aatcgtaccg 5160

tgtatttatg ctcatatata aacaactata tatatatata tatatatata tatgtgcgga 5220

gtgcaacagt gtctgtccag taggagataa gtctcgtttc cgctcccctg cttatgctat 5280

gaccttaggt ccagggcaag tatgagttac cgaatctatc tattaggtgc atctaacgaa 5340

aggaatcatt agctctgcac gaactctagc cgtagcctat tgtaatccat ttgtatgttt 5400

ggcttaagcg ttttacttgt tgaatataaa gtgtaaaatt atttttgaaa aaaaaaaacc 5460

cacacaaaac acaaatcgtt tgttctatat ttctgtttca aaactaactc gttacccaca 5520

atcccctctg ttatgtataa ttaggatctc tgtacac 5557

<210> 39

<211> 1331 

<212> PRT 

<213> Drosophila 

<400> 39 

Met Ser Thr Ser Thr Ala Thr Thr Ser Val Ile Thr Ser Asn Glu Leu
1 5 10 15

Ser Leu Ser Gly His Ala His Gly His Gly His Ala His Gln Leu His
20 25 30

Gln His Thr His Ser Arg Leu Gly Val Gly Val Gly Val Gly Ile Leu
35 40 45

Ser Asp Ala Ser Leu Ser Pro Ile Gln Gln Gly Ser Gly Gly His Ser
50 55 60

Gly Gly Gly Asn Thr Asn Ser Ser Pro Leu Ala Pro Asn Gly Val Pro
65 70 75 80

Leu Leu Thr Thr Met His Arg Ser Pro Asp Ser Pro Gln Pro Glu Leu
85 90 95

Ala Thr Met Thr Asn Val Asn Val Leu Asp Leu His Thr Asp Asn Ser
100 105 110

Lys Leu Tyr Asp Lys Glu Ala Val Phe Ile Tyr Glu Thr Pro Lys Val
115 120 125

Val Met Pro Ala Asp Gly Gly Gly Gly Asn Asn Ser Asp Glu Gly His
130 135 140

Ala Ile Asp Ala Arg Ile Ala Ala Gln Met Gly Asn Gln Ala Gln Gln
145 150 155 160

Gln Gln Gln Gln Gln Gln Gln Thr Glu His Gln Pro Leu Ala Lys Ile
165 170 175

Glu Phe Asp Glu Asn Gln Ile Ile Arg Val Val Gly Pro Asn Gly Glu
180 185 190

Gln Gln Gln Ile Ile Ser Arg Glu Ile Ile Asn Gly Glu His His Ile
195 200 205

Leu Ser Arg Asn Glu Ala Gly Glu His Ile Leu Thr Arg Ile Val Ser
210 215 220

Asp Pro Ser Lys Leu Met Pro Asn Asp Asn Ala Val Ala Thr Ala Met
225 230 235 240

Tyr Asn Gln Ala Gln Lys Met Asn Asn Asp His Gly Gln Ala Val Tyr
245 250 255

Gln Thr Ser Pro Leu Pro Leu Asp Ala Ser Val Leu His Tyr Ser Gly
260 265 270

Gly Asn Asp Ser Asn Val Ile Lys Thr Glu Ala Asp Ile Tyr Glu Asp
275 280 285

His Lys Lys His Ala Ala Ala Ala Ala Ala Ala Ala Gly Gly Gly Ser
290 295 300

Ile Ile Tyr Thr Thr Ser Asp Pro Asn Gly Val Asn Val Lys Gln Leu
305 310 315 320

Pro His Leu Thr Val Pro Gln Lys Leu Asp Pro Asp Leu Tyr Gln Ala
325 330 335

Asp Lys His Ile Asp Leu Ile Tyr Asn Asp Gly Ser Lys Thr Val Ile
340 345 350

Tyr Ser Thr Thr Asp Gln Lys Ser Leu Glu Ile Tyr Ser Gly Gly Asp
355 360 365

Ile Gly Ser Leu Val Ser Asp Gly Gln Val Val Val Gln Ala Gly Leu
370 375 380

Pro Tyr Ala Thr Thr Thr Gly Ala Gly Gly Gln Pro Val Tyr Ile Val
385 390 395 400

Ala Asp Gly Ala Leu Pro Ala Gly Val Glu Glu His Leu Gln Ser Gly
405 410 415

Lys Leu Asn Gly Gln Thr Thr Pro Ile Asp Val Ser Gly Leu Ser Gln
420 425 430

Asn Glu Ile Gln Gly Phe Leu Leu Gly Ser His Pro Ser Ser Ser Ala
435 440 445

Thr Val Ser Thr Thr Gly Val Val Ser Thr Thr Thr Ile Ser His His
450 455 460

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
465 470 475 480

Gln His Gln Gln Gln Gln Gln His Pro Gly Asp Ile Val Ser Ala Ala
485 490 495

Gly Val Gly Ser Thr Gly Ser Ile Val Ser Ser Ala Ala Gln Gln Gln
500 505 510

Gln Gln Gln Gln Leu Ile Ser Ile Lys Arg Glu Pro Glu Asp Leu Arg
515 520 525

Lys Asp Pro Lys Asn Gly Asn Ile Ala Gly Ala Ala Thr Ala Asn Gly
530 535 540

Pro Gly Ser Val Ile Thr Gln Lys Ile Leu His Val Asp Ala Pro Thr
545 550 555 560

Ala Ser Glu Ala Asp Arg Pro Ser Thr Pro Ser Ser Ser Ile Asn Ser
565 570 575

Thr Glu Asn Thr Glu Ser Asp Ser Gln Ser Val Ser Gly Ser Glu Ser
580 585 590

Gly Ser Pro Gly Ala Arg Thr Thr Ala Thr Leu Glu Met Tyr Ala Thr
595 600 605

Thr Gly Gly Thr Gln Ile Tyr Leu Gln Thr Ser His Pro Ser Thr Ala
610 615 620

Ser Gly Ala Gly Gly Gly Ala Gly Pro Ala Gly Ala Ala Gly Gly Gly
625 630 635 640



Gly Val Ser Met Gln Ala Gln Ser Pro Ser Pro Gly Pro Tyr Ile Thr
645 650 655

Ala Asn Asp Tyr Gly Met Tyr Thr Ala Ser Arg Leu Pro Pro Gly Pro
660 665 670

Pro Pro Thr Ser Thr Thr Thr Phe Ile Ala Glu Pro Ser Tyr Tyr Arg
675 680 685

Glu Tyr Phe Ala Pro Asp Gly Gln Gly Gly Tyr Val Pro Ala Ser Thr
690 695 700

Arg Ser Leu Tyr Gly Asp Val Asp Val Ser Val Ser Gln Pro Gly Gly
705 710 715 720

Val Val Thr Tyr Glu Gly Arg Phe Ala Gly Ser Val Pro Pro Pro Ala
725 730 735

Thr Thr Thr Val Leu Thr Ser Val His His His Gln Gln Gln Gln Gln
740 745 750

Gln Gln Gln Gln His Gln Gln Gln Gln Gln Gln Gln Gln His His Gln
755 760 765

Gln Gln Gln His His Ser Gln Asp Gly Lys Ser Asn Gly Gly Ala Thr
770 775 780

Pro Leu Tyr Ala Lys Ala Ile Thr Ala Ala Gly Leu Thr Val Asp Leu
785 790 795 800

Pro Ser Pro Asp Ser Gly Ile Gly Thr Asp Ala Ile Thr Pro Arg Asp
805 810 815

Gln Thr Asn Ile Gln Gln Ser Phe Asp Tyr Thr Glu Leu Cys Gln Pro
820 825 830

Gly Thr Leu Ile Asp Ala Asn Gly Ser Ile Pro Val Ser Val Asn Ser
835 840 845

Ile Gln Gln Arg Thr Ala Val His Gly Ser Gln Asn Ser Pro Thr Thr
850 855 860

Ser Leu Val Asp Thr Ser Thr Asn Gly Ser Thr Arg Ser Arg Pro Trp
865 870 875 880



His Asp Phe Gly Arg Gln Asn Asp Ala Asp Lys Ile Gln Ile Pro Lys
885 890 895

Ile Phe Thr Asn Val Gly Phe Arg Tyr His Leu Glu Ser Pro Ile Ser
900 905 910

Ser Ser Gln Arg Arg Glu Asp Asp Arg Ile Thr Tyr Ile Asn Lys Gly
915 920 925

Gln Phe Tyr Gly Ile Thr Leu Glu Tyr Val His Asp Ala Glu Lys Pro
930 935 940

Ile Lys Asn Thr Thr Val Lys Ser Val Ile Met Leu Met Phe Arg Glu
945 950 955 960

Glu Lys Ser Pro Glu Asp Glu Ile Lys Ala Trp Gln Phe Trp His Ser
965 970 975

Arg Gln His Ser Val Lys Gln Arg Ile Leu Asp Ala Asp Thr Lys Asn
980 985 990

Ser Val Gly Leu Val Gly Cys Ile Glu Glu Val Ser His Asn Ala Ile
995 1000 1005

Ala Val Tyr Trp Asn Pro Leu Glu Ser Ser Ala Lys Ile Asn Ile
1010 1015 1020

Ala Val Gln Cys Leu Ser Thr Asp Phe Ser Ser Gln Lys Gly Gly
1025 1030 1035

Leu Pro Leu His Val Gln Ile Asp Thr Phe Glu Asp Pro Arg Asp
1040 1045 1050

Thr Ala Val Phe His Arg Gly Tyr Cys Gln Ile Lys Val Phe Cys
1055 1060 1065

Asp Lys Gly Ala Glu Arg Lys Thr Arg Asp Glu Glu Arg Arg Ala
1070 1075 1080

Ala Lys Arg Lys Met Thr Ala Thr Gly Arg Lys Lys Leu Asp Glu
1085 1090 1095

Leu Tyr His Pro Val Thr Asp Arg Ser Glu Phe Tyr Gly Met Gln
1100 1105 1110

Asp Phe Ala Lys Pro Pro Val Leu Phe Ser Pro Ala Glu Asp Met
1115 1120 1125

Glu Lys Val Gly Gln Leu Gly Ile Gly Ala Ala Thr Gly Met Thr
1130 1135 1140

Phe Asn Pro Leu Ser Asn Gly Asn Ser Asn Ser Asn Ser His Ser
1145 1150 1155

Ser Leu Gln Ser Phe Tyr Gly His Glu Thr Asp Ser Pro Asp Leu
1160 1165 1170

Lys Gly Ala Ser Pro Phe Leu Leu His Gly Gln Lys Val Ala Thr
1175 1180 1185

Pro Thr Leu Lys Phe His Asn His Phe Pro Pro Asp Met Gln Thr
1190 1195 1200

Asp Lys Lys Asp His Ile Leu Asp Gln Asn Met Leu Thr Ser Thr
1205 1210 1215

Pro Leu Thr Asp Phe Gly Pro Pro Met Lys Arg Gly Arg Met Thr
1220 1225 1230

Pro Pro Thr Ser Glu Arg Val Met Leu Tyr Val Arg Gln Glu Asn
1235 1240 1245

Glu Glu Val Tyr Thr Pro Leu His Val Val Pro Pro Thr Thr Ile
1250 1255 1260

Gly Leu Leu Asn Ala Ile Glu Asn Lys Tyr Lys Ile Ser Thr Thr
1265 1270 1275

Ser Ile Asn Asn Ile Tyr Arg Thr Asn Lys Lys Gly Ile Thr Ala
1280 1285 1290

Lys Ile Asp Asp Asp Met Ile Ser Phe Tyr Cys Asn Glu Asp Ile
1295 1300 1305

Phe Leu Leu Glu Val Gln Gln Ile Glu Asp Asp Leu Tyr Asp Val
1310 1315 1320

Thr Leu Thr Glu Leu Pro Asn Gln
1325 1330

<210> 40

<211> 18

<212> DNA

<213> murine

<400> 40

ggatcagaag accatgcc 18

<210> 41

<211> 18

<212> DNA

<213> murine

<400> 41

aggctgttag agttggtg 18

<210> 42

<211> 18

<212> DNA

<213> murine

<400> 42

ctgtagccag ctttcatc 18

<210> 43

<211> 19

<212> DNA

<213> murine

<400> 43

gctggtgaaa aggacctct 19

<210> 44

<211> 20

<212> DNA

<213> murine

<400> 44

cacaggacta gaacacctgc 20

<210> 45

<211> 17

<212> DNA

<213> murine

<400> 45

cacattgaag aggtggc 17

<210> 46

<211> 20

<212> DNA

<213> murine

<400> 46

aagggtgagc aggttcgctt 20